XPages: Creating a JSON string from a large number of documents - Java

I have been trying to create a JSON string from a large number of documents using the code below, but I either get an out-of-range error or have to wait up to 5 minutes before the string is created. Any idea how I could optimise the code?
public String getJson() throws NotesException {
...
View view1 = ...;
ViewNavigator nav =view1.createViewNav();
ViewEntry ve = nav.getFirst();
JSONObject jsonMain = new JSONObject();
JSONArray items = new JSONArray();
Document docRoot = null;
while (ve != null) {
docRoot= ve.getDocument();
items.add(getJsonDocAndChildren(docRoot));
ViewEntry veTemp = nav.getNextSibling(ve);
ve.recycle();
ve = veTemp;
}
jsonMain.put("identifier", "name");
jsonMain.put("label", "name");
jsonMain.put("items", items);
return jsonMain.toJSONString();
}
private JSONObject getJsonDocAndChildren(Document doc) throws NotesException {
String name = doc.getItemValueString("Name");
JSONObject jsonDoc = new JSONObject();
jsonDoc.put("name", name);
jsonDoc.put("field", doc.getItemValueString("field"));
DocumentCollection responses = doc.getResponses();
JSONArray children = new JSONArray();
getDocEntry(name, children); // adds every document whose field has the same value as name to children
if (responses.getCount() > 0) {
Document docResponse = responses.getFirstDocument();
while (docResponse != null) {
children.add(getJsonDocAndChildren(docResponse));
Document docTemp = responses.getNextDocument(docResponse);
docResponse.recycle();
docResponse = docTemp;
}
}
jsonDoc.put("children", children);
return jsonDoc;
}

There are a few things here, ranging from general efficiency to optimizations based on how you want to use the code.
The big one that would likely speed up your processing would be to do view operations only, without cracking open the documents. Since it looks like you want to get responses indiscriminately, you could add the response documents to the original view, with the "Show responses in hierarchy" option turned on. Then, if you have columns for Name and field in the view (and no "Show responses only" columns), a nav.getNext() walk down the view will get them in turn. By storing the entry.getIndentLevel() value for each previous entry and comparing it at the start of the loop, you could "step" up and down the JSON tree: when the indent level increases by one, create a new array and add it to the existing object; when it decreases, step up one. It may be a little conceptually awkward at first, having to track previous states in a flat loop, but it'd be much more efficient.
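The flat-loop idea can be sketched without any Domino classes. In this minimal sketch, `Row` is a hypothetical stand-in for a view entry (its indent level plus the Name column value), and top-level rows are assumed to have indent 0 and never to jump down by more than one level at a time. It builds nested maps rather than JSONObjects, but the stepping logic would be the same:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndentTreeBuilder {
    /** Hypothetical stand-in for one view row: indent level and the "Name" column value. */
    static final class Row {
        final int indent;
        final String name;
        Row(int indent, String name) { this.indent = indent; this.name = name; }
    }

    /** Turns a flat, hierarchy-ordered list of rows into nested maps with "name" and "children". */
    static List<Map<String, Object>> build(List<Row> rows) {
        List<Map<String, Object>> roots = new ArrayList<>();
        // levels.get(d) is the children list that a row at indent d is appended to
        List<List<Map<String, Object>>> levels = new ArrayList<>();
        levels.add(roots);
        for (Row row : rows) {
            // Indent decreased: step back up by dropping the deeper child lists
            while (levels.size() > row.indent + 1) {
                levels.remove(levels.size() - 1);
            }
            List<Map<String, Object>> children = new ArrayList<>();
            Map<String, Object> node = new LinkedHashMap<>();
            node.put("name", row.name);
            node.put("children", children);
            levels.get(row.indent).add(node);
            // A following row that is one level deeper attaches to this node
            levels.add(children);
        }
        return roots;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(0, "A"), new Row(1, "B"), new Row(2, "C"),
                new Row(1, "D"), new Row(0, "E"));
        System.out.println(build(rows));
    }
}
```

In the real loop, the rows would come from the ViewNavigator walk, with the values read from the entry's column values instead of from the document.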
Another option, also having the benefit of not having to crack open each individual document, would be to have a view of the response documents categorized by @Text($REF) and then make your recursive method look more like:
public static void walkTree(final View treeView, final String documentId) {
ViewNavigator nav = treeView.createViewNavFromCategory(documentId);
nav.setBufferMaxEntries(400);
for (ViewEntry entry : nav) {
// Do code here
walkTree(treeView, entry.getUniversalID());
}
}
(That example uses the OpenNTF Domino API; if you're not using that, you could down-convert the for loop to the legacy style.)
As a minor improvement any time you traverse through ViewNavigators, you can set view.setAutoUpdate(false) and then nav.setBufferMaxEntries(400) to improve the internal caching.
And finally, depending on your needs - say, if you're outputting the JSON directly to an HTTP response's output stream - you could use JsonWriter instead of JsonObject to stream the content out instead of building a huge object in memory. I wrote about it with some simple code here: https://frostillic.us/blog/posts/EF0B875453B3CFC285257D570072F78F

You should first determine where the time is spent in your code. Maybe it is in doc.getResponses() or responses.getNextDocument() which you did not show here.
The obvious optimization which could be done within your code snippet is the following:
Basically you have some data structure called Document and build up a corresponding in memory JSON structure consisting of JSONObjects and JSONArrays. This JSON structure is then serialized to a String and returned.
Instead of building the JSON structure you could directly use a JsonWriter (don't know what JSON library you are using but there must be something like a JsonWriter). This avoids the memory allocations for the temporary JSON structure.
In getJson() you start:
StringWriter stringOut = new StringWriter();
JsonWriter out = new JsonWriter(stringOut);
and end
return stringOut.toString();
Now, everywhere you are creating JSONObjects or JSONArrays, you invoke the corresponding writer methods instead, e.g.
private void getJsonDocAndChildren(Document doc, JsonWriter out) throws NotesException {
out.name("name");
out.value(doc.getItemValueString("Name"));
out.name("field");
out.value(doc.getItemValueString("field"));
DocumentCollection responses = doc.getResponses();
if (responses.getCount() > 0) {
Document docResponse = responses.getFirstDocument();
out.startArray();
...
Hope you get the idea.
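Since the JSON library is not named here, the same idea can be sketched with nothing but a java.io.Writer. This is a minimal sketch, not the library's JsonWriter: `Node` is a hypothetical stand-in for a document and its responses, and `escape` handles only backslashes and double quotes, which a real writer would do more thoroughly. The point is that each node is written as soon as it is visited, so no intermediate JSON tree is held in memory:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StreamingTreeJson {
    /** Hypothetical stand-in for a document and its response documents. */
    static final class Node {
        final String name;
        final List<Node> children;
        Node(String name, List<Node> children) { this.name = name; this.children = children; }
    }

    /** Writes the node and its descendants straight to the writer. */
    static void write(Node node, Writer out) throws IOException {
        out.write("{\"name\":\"");
        out.write(escape(node.name));
        out.write("\",\"children\":[");
        for (int i = 0; i < node.children.size(); i++) {
            if (i > 0) {
                out.write(',');
            }
            write(node.children.get(i), out);
        }
        out.write("]}");
    }

    /** Minimal escaping for this sketch only: backslash and double quote. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Convenience wrapper for callers that still need a String. */
    static String toJson(Node root) {
        StringWriter buffer = new StringWriter();
        try {
            write(root, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Node b = new Node("b", Collections.emptyList());
        Node c = new Node("c", Collections.emptyList());
        System.out.println(toJson(new Node("a", Arrays.asList(b, c))));
    }
}
```

Passing the HTTP response's writer to `write` instead of a StringWriter would avoid building the large string as well.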

Related

Java | GSON | Add JSON objects to existing JSON-File

I have currently started a kind of diary project to teach myself how to code, which I write in Java. The project has a graphical interface which I realized with JavaFX.
I want to write data into a JSON file, which I enter into two text fields and a slider. Such a JSON entry should look like this:
{
"2019-01-13": {
"textfield1": "test1",
"textfield2": "test2",
"Slider": 2
}
}
I have already created a class in which the values can be passed and retrieved by the JSONWriter.
The class looks like this:
public class Entry {
private String date, textfield1, textfield2;
private Integer slider;
public String getDate() {
return date;
}
public void setDate(String date) {
this.date = date;
}
public String getTextfield1() {
return textfield1;
}
public void setTextfield1(String textfield1) {
this.textfield1 = textfield1;
}
public String getTextfield2() {
return textfield2;
}
public void setTextfield2(String textfield2) {
this.textfield2 = textfield2;
}
public Integer getSlider() {
return slider;
}
public void setSlider(Integer slider) {
this.slider= slider;
}
}
The code of the JSONWriter looks like this:
void json() throws IOException {
Gson gson = new GsonBuilder().setPrettyPrinting().create();
JsonWriter writer = new JsonWriter(new FileWriter("test.json",true));
JsonParser parser = new JsonParser();
Object obj = parser.parse(new FileReader("test.json"));
JsonObject jsonObject = (JsonObject) obj;
System.out.println(jsonObject);
writer.beginObject();
writer.name(entry.getDate());
writer.beginObject();
writer.name("textfield1").value(entry.getTextfield1());
writer.name("textfield2").value(entry.getTextfield2());
writer.name("Slider").value(entry.getSlider());
writer.endObject();
writer.endObject();
writer.close();
}
The date is obtained from the datepicker. Later I want to filter the data from the JSON file by date and transfer the contained values (textfield1, textfield2, slider) into the corresponding fields.
If possible, I would also like to try to overwrite the objects of a date. This means, if an entry of the date already exists and I want to change something in the entries, it should be replaced in the JSON file, so I can retrieve it later.
If you can recommend a better memory type for this kind of application, I am open for it. But it should also be compatible with databases later on. Later I would like to deal with databases as well.
So far I have no idea how to do this because I am still at the beginning of programming. I've been looking for posts that could cover the topic, but I haven't really found anything I understand.
You could start without JsonParser and JsonWriter and use Gson's fromJson(..) and toJson(..) because your current Json format is easily mapped as a map of entry POJOs.
Creating some complex implementation with JsonParser & JsonWriter might be more efficient for large amounts of data, but at that point you should already have studied how to persist to a database anyway.
POJOs are easy to manipulate and they can be later easily persisted to db - for example if you decide to use technology like JPA with only few annotations.
See below simple example:
@Test
public void test() throws IOException {
Gson gson = new GsonBuilder().setPrettyPrinting().create();
// Your current Json seems to be a map with date string as a key
// Create a corresponding type for gson to deserialize to
// correct generic types
Type type = new TypeToken<Map<String, Entry>>() {}.getType();
// Check this file name for your environment
String fileName = "src/test/java/org/example/diary/test.json";
Reader reader = new FileReader(new File(fileName));
// Read the whole diary to memory as java objects
Map<String, Entry> diary = gson.fromJson(reader, type);
// Modify one field
diary.get("2019-01-13").setTextfield1("modified field");
// Add a new date entry
Entry e = new Entry();
e.setDate("2019-01-14");
e.setScale(3);
e.setTextfield1("Dear Diary");
e.setTextfield2("I met a ...");
diary.put(e.getDate(), e);
// Store the new diary contents. Note that this one does not overwrite the
// original file but appends ".out.json" to the file name to preserve the original
FileWriter fw = new FileWriter(new File(fileName + ".out.json"));
gson.toJson(diary, fw);
fw.close();
}
This should produce a test.json.out.json like:
{
"2019-01-13": {
"textfield1": "modified field",
"textfield2": "test2",
"Slider": 2
},
"2019-01-14": {
"date": "2019-01-14",
"textfield1": "Dear Diary",
"textfield2": "I met a ...",
"Slider": 3
}
}
Note that I also made a small assumption here:
// Just in case you meant to map "Slider" in Json as "scale"
@SerializedName("Slider")
private Integer scale;
I will give you some general tips; it is up to you to go deeper.
First of all, I recommend this architecture, which is common in web applications and even desktop apps, to keep the front-end layer separate from the back-end server:
Front-end (use Java Fx if you want). Tutorial: http://www.mastertheboss.com/jboss-frameworks/resteasy/rest-services-using-javafx-tutorial
Back-end (Java 1.8, Spring Boot, MySQL database). Example: there are tons of examples and tutorials using this stack; I recommend the mkyong or Baeldung blogs.
The front-end will communicate with the server over HTTP through the back-end REST API, using JSON or XML as the message format. In real life they are physically separated, but here you can just create 2 different Java projects running on different ports.
For the back-end, just follow the tutorial to get up and running a REST API server. Set up MVC pattern: Controller layer, Service layer, Repository layer, model layer, dto layers, etc. For your specific model I recommend you the following:
selected_date: Date
inputs: Map of strings
size: Integer
On Front-end project with Java FX, just re-use the code you already wrote and add some CSS if you want. Use the components actions to call the back-end REST API to create, retrieve, update and delete your data from date-picker or whatever operation you want to do.
You will be converting Java objects to JSON strings all the time, so I recommend the Gson or Jackson library, which do this directly without you needing to build the JsonObject manually. If you still want to write the JSON to a file, convert the Java object to a string (the object written in JSON format) using one of those libraries, and then write the string to the file. But I strongly believe it will be more practical to use a database.
Hope it helps

Upload documents into Watson's Retrieve & Rank service

I'm implementing a solution using Watson's Retrieve & Rank service.
When I use the tooling interface, I upload my documents and they appear as a list, where I can click on any of them to open up all the Titles that are inside the document ( Answer Units ), as you can see on the Picture 1 and Picture 2.
When I try to upload documents via Java, it won't recognize the documents; they get uploaded in parts (Answer Units as documents), each part as a new document.
I would like to know how I can upload each document as an entire document and not only parts of it.
Here's the code for the upload function in Java:
public Answers ConvertToUnits(File doc, String collection) throws ParseException, SolrServerException, IOException{
DC.setUsernameAndPassword(USERNAME,PASSWORD);
Answers response = DC.convertDocumentToAnswer(doc).execute();
SolrInputDocument newdoc = new SolrInputDocument();
WatsonProcessing wp = new WatsonProcessing();
Collection<SolrInputDocument> newdocs = new ArrayList<SolrInputDocument>();
for(int i=0; i<response.getAnswerUnits().size(); i++)
{
String titulo = response.getAnswerUnits().get(i).getTitle();
String id = response.getAnswerUnits().get(i).getId();
newdoc.addField("title", titulo);
for(int j=0; j<response.getAnswerUnits().get(i).getContent().size(); j++)
{
String texto = response.getAnswerUnits().get(i).getContent().get(j).getText();
newdoc.addField("body", texto);
}
wp.IndexDocument(newdoc,collection);
newdoc.clear();
}
wp.ComitChanges(collection);
return response;
}
public void IndexDocument(SolrInputDocument newdoc, String collection) throws SolrServerException, IOException
{
UpdateRequest update = new UpdateRequest();
update.add(newdoc);
UpdateResponse addResponse = solrClient.add(collection, newdoc);
}
You can specify config options in this line:
Answers response = DC.convertDocumentToAnswer(doc).execute();
I think something like this should do the trick:
String configAsString = "{ \"conversion_target\":\"answer_units\", \"answer_units\": { \"selector_tags\": [] } }";
JsonParser jsonParser = new JsonParser();
JsonObject customConfig = jsonParser.parse(configAsString).getAsJsonObject();
Answers response = DC.convertDocumentToAnswer(doc, null, customConfig).execute();
I've not tried it out, so might not have got the syntax exactly right, but hopefully this will get you on the right track.
Essentially, what I'm trying to do here is use the selector_tags option in the config (see https://www.ibm.com/watson/developercloud/doc/document-conversion/customizing.shtml#htmlau for doc on this) to specify which tags the document should be split on. By specifying an empty list with no tags in, it results in it not being split at all - and coming out in a single answer unit as you want.
(Note that you can do this through the tooling interface, too - by unticking the "Split my documents up into individual answers for me" option when you upload the document)

Overwrite JsonNode using Jackson

I am using the Jackson streaming api to read in a json file like so:
// Go through json model and grab needed resources.
JsonFactory jsonfactory = new JsonFactory();
JsonParser jp = jsonfactory.createParser(fis);
JsonToken current;
current = jp.nextToken();
ObjectMapper mapper = new ObjectMapper();
if (current != JsonToken.START_OBJECT) {
System.out.println("Error: root should be object: quiting.");
return null;
}
while (jp.nextToken() != JsonToken.END_OBJECT) {
String fieldName = jp.getCurrentName();
// move from field name to field value
if ("Field1".equals(fieldName)) {
jp.nextToken();
JsonNode json = mapper.readTree(jp);
//Manipulate JsonNode
/*Want to write back into json file in place of
old object with manipulated node*/
}
else {
jp.skipChildren();
}
}
With the code above I parse the JSON file until I find the field I am looking for, read it into a JsonNode object, and then manipulate some of the data in that node. My question: is there a way to delete that node from the JSON file and write a newly created POJO into the file under the same field name, in place of the old one? Everything I can find online involves reading the whole JSON file into a JsonNode, which I would like to avoid as this file can be quite large.
In-place editing of a file like that is usually pretty complicated; a simpler approach is to create a new temporary file and, for the most part, just copy what you read into it, until you hit the condition where the new file should get modified content instead.
Then at the end you could delete the original file and rename the temporary one to "replace" it. Unless disk space is an issue, though, I personally like keeping the original source around (especially in automated systems) for troubleshooting.
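The file handling around that approach could look roughly like the sketch below, which uses only java.nio. `Rewriter` is a hypothetical callback invented for this example; with Jackson, its body would presumably wrap the reader in a JsonParser and the writer in a JsonGenerator, copy tokens across, and substitute the new value when it reaches the target field. The `.bak` suffix for the kept original is also just an assumption:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TempFileRewrite {
    /** Hypothetical callback: reads the old content and writes the new content in one streaming pass. */
    interface Rewriter {
        void rewrite(BufferedReader in, BufferedWriter out) throws IOException;
    }

    static void rewrite(Path original, Rewriter rewriter, boolean keepBackup) throws IOException {
        // Create the temp file next to the original so the final move stays on one file system
        Path dir = original.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "rewrite-", ".tmp");
        try {
            try (BufferedReader in = Files.newBufferedReader(original, StandardCharsets.UTF_8);
                 BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
                rewriter.rewrite(in, out);
            }
            if (keepBackup) {
                // Keep the untouched original around for troubleshooting
                Path backup = original.resolveSibling(original.getFileName() + ".bak");
                Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);
            }
            // Only replace the original once the new file has been written completely
            Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up after a failed rewrite; does nothing after a successful move
            Files.deleteIfExists(temp);
        }
    }
}
```

Because the original is only replaced after the rewrite has finished, a failure halfway through leaves the source file untouched.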

Having trouble extracting values from JSON

Examples:
{"name":"tv.twitch:twitch:5.16"}
{"name":"tv.twitch:twitch-external-platform:4.5","extract":{"exclude":["META-INF/"]},"natives":{"windows":"natives-windows-${arch}"},"rules":[{"os":{"name":"windows"},"action":"allow"}]}
These lines came from a JSONArray, I'd like to extract the "natives" portion. The problem is, not all items in the JSONArray have the "natives" value. Here is my current code to extract the "name" value
JSONObject json = new JSONObject(readUrl(url.toString()));
JSONArray jsonArray = json.getJSONArray("libraries");
ArrayList<String> libraries = new ArrayList<String>();
for (int i = 0; i < jsonArray.length(); i++) {
JSONObject next = jsonArray.getJSONObject(i);
String lib = next.getString("name");
libraries.add(lib);
}
I'm not exactly sure about this since I am new to java/JSON parsing, but would an object in the array without the "natives" value cause the program to end?
You can use the has method of JSONObject to determine whether it contains the specified key.
Determine if the JSONObject contains a specific key.
In your case you can do like this:
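JSONObject json = new JSONObject(readUrl(url.toString()));
JSONArray jsonArray = json.getJSONArray("libraries");
for (int i = 0; i < jsonArray.length(); i++) {
// "natives" lives on each library entry, so check each entry, not the root object
JSONObject next = jsonArray.getJSONObject(i);
if(next.has("natives")) {
//Logic to extract natives
} else {
//Logic to extract without natives
}
}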
I think these few lines should suffice for your requirement. See the JSONObject API documentation for details.
You seem to want to extract content at JSON Pointers /name and /natives/windows.
In this case, using this library (which depends on Jackson), it is as simple as:
// All of these are thread safe
private static final ObjectReader READER = JacksonUtils.getReader();
private static final JsonPointer NAME_POINTER = JsonPointer.of("name");
private static final JsonPointer WINDOWS_POINTER
= JsonPointer.of("natives", "windows");
// Fetch content from URL
final JsonNode content = READER.readTree(url.getInputStream());
// Get content at pointers, if any
final JsonNode nameNode = NAME_POINTER.path(content);
final JsonNode windowsNode = WINDOWS_POINTER.path(content);
Then, to check if a node actually exists, check against .isMissingNode():
if (windowsNode.isMissingNode())
// deal with no windows content
Alternatively, use .get() instead of .path() and check for null instead.

Streaming a json element

Let's say I have a json that looks like this:
{"body":"abcdef","field":"fgh"}
Now suppose the value of the 'body' element is huge(~100 MB or more). I would like to stream out the value of the body element instead of storing it in a String.
How can I do this? Is there any Java library I could use for this?
This is the line of code that fails with an OutOfMemoryError when a large JSON value comes in:
String inputStreamString = (String) JsonPath.read(textValue.toString(), "$.body");
'textValue' here is a hadoop.io.Text object.
I'm assuming that the OutOfMemory error occurs because we try to do method calls like toString() (which creates a new object), and JsonPath.read(), all of which are done in-memory. I need to know if there is an approach I could take while handling large-sized textValue objects.
Please let me know if you need additional info.
JsonSurfer is good for processing very large JSON data with selective extraction.
Here is an example of surfing through JSON data and collecting the matched values in a listener:
BufferedReader reader = new BufferedReader(new FileReader(jsonFile));
JsonSurfer surfer = new JsonSurfer(GsonParser.INSTANCE, GsonProvider.INSTANCE);
SurfingConfiguration config = surfer.configBuilder().bind("$.store.book[*]", new JsonPathListener() {
@Override
public void onValue(Object value, ParsingContext context) throws Exception {
JsonObject book = (JsonObject) value;
}
}).build();
surfer.surf(reader, config);
Jackson offers a streaming API for generating and processing JSON data.
