How to make minor edits to an XML file in Java

How to make minor edits to an XML file in Java - java

I am trying to change a single value in a large (5mb) XML file. I always know the value will be in the first 10 lines, therefore I do not need to read in 99% of the file. Yet it seems doing a partial XML read in Java is quite tricky.
In this picture you can see the single value I need to access.
I have read a lot about XML in Java and the best practices of handling it. However, in this case I am unsure of what the best approach would be - A DOM, STAX or SAX parser all seem to have different best use case scenarios - and I am not sure which would best suit this problem. Since all I need to do is edit one value.
Perhaps, I shouldn't even use an XML parser and just go with regex, but it seem like it is a pretty bad idea to use regex on XML
Hoping someone could point me in the right direction,
Thanks!

I would choose DOM over SAX or StAX simply for the (relative) simplicity of the API. Yes, there is some boilerplate code to get the DOM populated, but once you get past that it is fairly straight-forward.
Having said that, if your XML source is 100s or 1000s of megabytes, one of the streaming APIs would be better suited. As it is, 5MB is not what I would consider a large dataset, so go ahead and use DOM and call it a day:
import java.io.File;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;
public class ChangeVersion
{
public static void main(String[] args)
throws Exception
{
if (args.length < 3) {
System.err.println("Usage: ChangeVersion <input> <output> <new version>");
System.exit(1);
}
File inputFile = new File(args[0]);
File outputFile = new File(args[1]);
int updatedVersion = Integer.parseInt(args[2], 10);
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = domFactory.newDocumentBuilder();
Document doc = docBuilder.parse(inputFile);
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
XPathExpression expr = xpath.compile("/PremiereData/Project/#Version");
NodeList versionAttrNodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < versionAttrNodes.getLength(); i++) {
Attr versionAttr = (Attr) versionAttrNodes.item(i);
versionAttr.setNodeValue(String.valueOf(updatedVersion));
}
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new DOMSource(doc), new StreamResult(outputFile));
}
}

You can use the StAX parser to write the XML as you read it. While doing this you can replace the content as it parses. Using a StAX parser will only contain parts of the xml in memory at any given time.
public static void main(String [] args) throws Exception {
final String newProjectId = "888";
File inputFile = new File("in.xml");
File outputFile = new File("out.xml");
System.out.println("Reading " + inputFile);
System.out.println("Writing " + outputFile);
XMLInputFactory inFactory = XMLInputFactory.newInstance();
XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream(inputFile));
XMLOutputFactory factory = XMLOutputFactory.newInstance();
XMLEventWriter writer = factory.createXMLEventWriter(new FileWriter(outputFile));
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
boolean useExistingEvent; // specifies if we should use the event right from the reader
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
useExistingEvent = true;
// look for our Project element
if(event.getEventType() == XMLEvent.START_ELEMENT) {
// read characters
StartElement elemEvent = event.asStartElement();
Attribute attr = elemEvent.getAttributeByName(QName.valueOf("ObjectID"));
// check to see if this is the project we want
// TODO: put what logic you want here
if("Project".equals(elemEvent.getName().getLocalPart()) && attr != null && attr.getValue().equals("1")) {
Attribute versionAttr = elemEvent.getAttributeByName(QName.valueOf("Version"));
// we need to make a list of new attributes for this element which doesnt include the Version a
List<Attribute> newAttrs = new ArrayList<>(); // new list of attrs
Iterator<Attribute> existingAttrs = elemEvent.getAttributes();
while(existingAttrs.hasNext()) {
Attribute existing = existingAttrs.next();
// copy over everything but version attribute
if(!existing.getName().getLocalPart().equals("Version")) {
newAttrs.add(existing);
}
}
// add our new attribute for projectId
newAttrs.add(eventFactory.createAttribute(versionAttr.getName(), newProjectId));
// were using our own event instead of the existing one
useExistingEvent = false;
writer.add(eventFactory.createStartElement(elemEvent.getName(), newAttrs.iterator(), elemEvent.getNamespaces()));
}
}
// persist the existing event.
if(useExistingEvent) {
writer.add(event);
}
}
writer.close();
}

Related

Parsing HTML content from XML file

<xbrli:xbrl xmlns:aoi="http://www.aointl.com/20160331" xmlns:country="http://xbrl.sec.gov/country/2016-01-31" xmlns:currency="http://xbrl.sec.gov/currency/2016-01-31" xmlns:dei="http://xbrl.sec.gov/dei/2014-01-31" xmlns:exch="http://xbrl.sec.gov/exch/2016-01-31" xmlns:invest="http://xbrl.sec.gov/invest/2013-01-31" xmlns:iso4217="http://www.xbrl.org/2003/iso4217" xmlns:link="http://www.xbrl.org/2003/linkbase" xmlns:naics="http://xbrl.sec.gov/naics/2011-01-31" xmlns:nonnum="http://www.xbrl.org/dtr/type/non-numeric" xmlns:num="http://www.xbrl.org/dtr/type/numeric" xmlns:ref="http://www.xbrl.org/2006/ref" xmlns:sic="http://xbrl.sec.gov/sic/2011-01-31" xmlns:stpr="http://xbrl.sec.gov/stpr/2011-01-31" xmlns:us-gaap="http://fasb.org/us-gaap/2016-01-31" xmlns:us-roles="http://fasb.org/us-roles/2016-01-31" xmlns:us-types="http://fasb.org/us-types/2016-01-31" xmlns:utreg="http://www.xbrl.org/2009/utr" xmlns:xbrldi="http://xbrl.org/2006/xbrldi" xmlns:xbrldt="http://xbrl.org/2005/xbrldt" xmlns:xbrli="http://www.xbrl.org/2003/instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<link:schemaRef xlink:href="aoi-20160331.xsd" xlink:type="simple"/>
<xbrli:context id="FD2016Q4YTD">
<xbrli:entity>
<xbrli:identifier scheme="http://www.sec.gov/CIK">0000939930</xbrli:identifier>
</xbrli:entity>
<xbrli:period>
<xbrli:startDate>2015-04-01</xbrli:startDate>
<xbrli:endDate>2016-03-31</xbrli:endDate>
</xbrli:period>
</xbrli:context>
<aoi:OtherIncomeAndExpensePolicyTextBlock contextRef="FD2016Q4YTD" id="Fact-F51C7616E17E5B8B0B770D410BBF5A3E">
<div style="font-family:Times New Roman;font-size:10pt;"><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;font-weight:bold;">Other Income (Expense)</font></div><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;"></font></div></div>
</aoi:OtherIncomeAndExpensePolicyTextBlock>
</xbrli:xbrl>
This is My XML[XBRL], i need to parse this. This xml is my input and i don't know whether its a valid or not but in need output like this :
<div style="font-family:Times New Roman;font-size:10pt;"><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;font-weight:bold;">Other Income (Expense)</font></div><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;"></font></div></div>
Please someone share me the knowledge for this problem i am facing from last two weeks.
this is the code i am using
File fXmlFile = new File("/home/devteam-user1/Desktop/ky/UnitTesting.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
XPath xPath = XPathFactory.newInstance().newXPath();
final String DIV_UNDER_ROOT = "/*/aoi";
NodeList divList = (NodeList)xPath.compile(DIV_UNDER_ROOT)
.evaluate(doc, XPathConstants.NODESET);
System.out.println(divList.getLength());
for (int i = 0; i < divList.getLength() ; i++) { // just in case there is more than one
Node divNode = divList.item(i);
System.out.println(nodeToString(divNode));
//nodeToString method below
private static String nodeToString(Node node) throws Exception
{
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
StreamResult result = new StreamResult(new StringWriter());
transformer.transform(new DOMSource(node), result);
return result.getWriter().toString();
}

this works well for me
public static void main(String[] args) throws IOException {
FileInputStream fis = new FileInputStream("yourfile.xml");
Document doc = Jsoup.parse(Utils.streamToString(fis));
System.out.println(doc.select("aoi|OtherIncomeAndExpensePolicyTextBlock").html().toString());
}

Your main issue lies with
final String DIV_UNDER_ROOT = "/*/aoi";
Which is an XPath expression that matches "any node 2 levels under the root, which has a local name of aoi and no namespace". This is not what you want.
You want to match any contents of a node that is two levels deep, whose namespace is aliased by "aoi" (which means it belongs to the "http://www.aointl.com/20160331" namespace), and whose local name is "OtherIncomeAndExpensePolicyTextBlock".
Matching namespaces in XPath in Java is quiet cumbersome (see XPath with namespace in Java and How to query XML using namespaces in Java with XPath?), but long story short, you could try this way instead :
final String DIV_UNDER_ROOT = "//*[local-name()='OtherIncomeAndExpensePolicyTextBlock' and namespace-uri()='http://www.aointl.com/20160331']/*";
This will only work if your DocumentBuilderFactory is made namespace aware, so you should make sure by configuring it like so above :
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);

Pretty print XML in java 8

I have an XML file stored as a DOM Document and I would like to pretty print it to the console, preferably without using an external library. I am aware that this question has been asked multiple times on this site, however none of the previous answers have worked for me. I am using java 8, so perhaps this is where my code differs from previous questions? I have also tried to set the transformer manually using code found from the web, however this just caused a not found error.
Here is my code which currently just outputs each xml element on a new line to the left of the console.
import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class Test {
public Test(){
try {
//java.lang.System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");
DocumentBuilderFactory dbFactory;
DocumentBuilder dBuilder;
Document original = null;
try {
dbFactory = DocumentBuilderFactory.newInstance();
dBuilder = dbFactory.newDocumentBuilder();
original = dBuilder.parse(new InputSource(new InputStreamReader(new FileInputStream("xml Store - Copy.xml"))));
} catch (SAXException | IOException | ParserConfigurationException e) {
e.printStackTrace();
}
StringWriter stringWriter = new StringWriter();
StreamResult xmlOutput = new StreamResult(stringWriter);
TransformerFactory tf = TransformerFactory.newInstance();
//tf.setAttribute("indent-number", 2);
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(new DOMSource(original), xmlOutput);
java.lang.System.out.println(xmlOutput.getWriter().toString());
} catch (Exception ex) {
throw new RuntimeException("Error converting to String", ex);
}
}
public static void main(String[] args){
new Test();
}
}

In reply to Espinosa's comment, here is a solution when "the original xml is not already (partially) indented or contain new lines".
Background
Excerpt from the article (see References below) inspiring this solution:
Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath’s normalize-space to locate all the whitespace nodes and remove them first.
Java Code
public static String toPrettyString(String xml, int indent) {
try {
// Turn xml string into a document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
// Remove whitespaces outside tags
document.normalize();
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
document,
XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
Node node = nodeList.item(i);
node.getParentNode().removeChild(node);
}
// Setup pretty print options
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", indent);
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
// Return pretty print xml string
StringWriter stringWriter = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
return stringWriter.toString();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
Sample usage
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
System.out.println(toPrettyString(xml, 4));
Output
<root>
<name>Coco Puff</name>
<total>10</total>
</root>
References
Java: Properly Indenting XML String published on MyShittyCode
Save new XML node to file

I guess that the problem is related to blank text nodes (i.e. text nodes with only whitespaces) in the original file. You should try to programmatically remove them just after the parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.
original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);
for (int i = 0; i < blankTextNodes.getLength(); i++) {
blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}

This works on Java 8:
public static void main (String[] args) throws Exception {
String xmlString = "<hello><from>ME</from></hello>";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(new InputSource(new StringReader(xmlString)));
pretty(document, System.out, 2);
}
private static void pretty(Document document, OutputStream outputStream, int indent) throws Exception {
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
if (indent > 0) {
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", Integer.toString(indent));
}
Result result = new StreamResult(outputStream);
Source source = new DOMSource(document);
transformer.transform(source, result);
}

I've written a simple class for for removing whitespace in documents - supports command-line and does not use DOM / XPath.
Edit: Come to think of it, the project also contains a pretty-printer which handles existing whitespace:
PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().ignoreWhitespace().build();

Underscore-java has static method U.formatXml(string). I am the maintainer of the project. Live example
import com.github.underscore.U;
public class MyClass {
public static void main(String args[]) {
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
System.out.println(U.formatXml(xml));
}
}
Output:
<root>
<name>Coco Puff</name>
<total>10</total>
</root>

I didn't like any of the common XML formatting solutions because they all remove more than 1 consecutive new line character (for some reason, removing spaces/tabs and removing new line characters are inseparable...). Here's my solution, which was actually made for XHTML but should do the job with XML as well:
public String GenerateTabs(int tabLevel) {
char[] tabs = new char[tabLevel * 2];
Arrays.fill(tabs, ' ');
//Or:
//char[] tabs = new char[tabLevel];
//Arrays.fill(tabs, '\t');
return new String(tabs);
}
public String FormatXHTMLCode(String code) {
// Split on new lines.
String[] splitLines = code.split("\\n", 0);
int tabLevel = 0;
// Go through each line.
for (int lineNum = 0; lineNum < splitLines.length; ++lineNum) {
String currentLine = splitLines[lineNum];
if (currentLine.trim().isEmpty()) {
splitLines[lineNum] = "";
} else if (currentLine.matches(".*<[^/!][^<>]+?(?<!/)>?")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
++tabLevel;
} else if (currentLine.matches(".*</[^<>]+?>")) {
--tabLevel;
if (tabLevel < 0) {
tabLevel = 0;
}
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
} else if (currentLine.matches("[^<>]*?/>")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
--tabLevel;
if (tabLevel < 0) {
tabLevel = 0;
}
} else {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
}
}
return String.join("\n", splitLines);
}
It makes one assumption: that there are no <> characters except for those that comprise the XML/XHTML tags.

Create xml file :
new FileInputStream("xml Store - Copy.xml") ;// result xml file format incorrect !
so that, when parse the content of the given input source as an XML document
and return a new DOM object.
Document original = null;
...
original.parse("data.xml");//input source as an XML document

How to retrieve XML including tags using the DOM parser

I am using org.w3c.dom to parse an XML file. Then I need to return the ENTIRE XML for a specific node including the tags, not just the values of the tags. I'm using the NodeList because I need to count how many records are in the file. But I also need to read the file wholesale from the beginning and then write it out to a new XML file. But my current code only prints the value of the node, but not the node itself. I'm stumped.
public static void main(String[] args) {
try {
File fXmlFile = new File (args[0]);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList listOfRecords = doc.getElementsByTagName("record");
int totalRecords = listOfRecords.getLength();
System.out.println("Total number of records : " + totalRecords);
int amountToSplice = queryUser();
for (int i = 0; i < amountToSplice; i++) {
String stringNode = listOfRecords.item(i).getTextContent();
System.out.println(stringNode);
}
} catch (Exception e) {
e.printStackTrace();
}
}

getTextContent() will only "return the text content of this node and its descendants" i.e. you only get the content of the 'text' type nodes. When parsing XML it's good to remember there are several different types of node, see XML DOM Node Types.
To do what you want, you could create a utility method like this...
public static String nodeToString(Node node)
{
Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
t.setOutputProperty(OutputKeys.INDENT, "yes");
StringWriter sw = new StringWriter();
t.transform(new DOMSource(node), new StreamResult(sw));
return sw.toString();
}
Then loop and print like this...
for (int i = 0; i < amountToSplice; i++)
System.out.println(nodeToString(listOfRecords.item(i)));

get node raw text

How get node value with its children nodes? For example I have following node parsed into dom Document instance:
<root>
<ch1>That is a text with <value name="val1">value contents</value></ch1>
</root>
I select ch1 node using xpath. Now I need to get its contents, everything what is containing between <ch1> and </ch1>, e.g. That is a text with <value name="val1">value contents</value>.
How can I do it?

I have found the following code snippet that uses transformation, it gives almost exactly what I want. It is possible to tune result by changing output method.
public static String serializeDoc(Node doc) {
StringWriter outText = new StringWriter();
StreamResult sr = new StreamResult(outText);
Properties oprops = new Properties();
oprops.put(OutputKeys.METHOD, "xml");
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = null;
try {
t = tf.newTransformer();
t.setOutputProperties(oprops);
t.transform(new DOMSource(doc), sr);
} catch (Exception e) {
System.out.println(e);
}
return outText.toString();
}

If this is server side java (ie you do not need to worry about it running on other jvm's) and you are using the Sun/Oracle JDK, you can do the following:
import com.sun.org.apache.xml.internal.serialize.OutputFormat;
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
...
Node n = ...;
OutputFormat outputFormat = new OutputFormat();
outputFormat.setOmitXMLDeclaration(true);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
XMLSerializer ser = new XMLSerializer(baos, outputFormat);
ser.serialize(n);
System.out.println(new String(baos.toByteArray()));
Remember to ensure your ultimate conversion to string may need to take an encoding parameter if the parsed xml dom has its text nodes in a different encoding than your platforms default one or you'll get garbage on the unusual characters.

You could use jOOX to wrap your DOM objects and get many utility functions from it, such as the one you need. In your case, this will produce the result you need (using css-style selectors to find <ch1/>:
String xml = $(document).find("ch1").content();
Or with XPath as you did:
String xml = $(document).xpath("//ch1").content();
Internally, jOOX will use a transformer to generate that output, as others have mentioned

As far as I know, there is no equivalent of innerHTML in Document. DOM is meant to hide the details of the markup from you.
You can probably get the effect you want by going through the children of that node. Suppose for example that you want to copy out the text, but replace each "value" tag with a programmatically supplied value:
HashMap<String, String> values = ...;
StringBuilder str = new StringBuilder();
for(Element child = ch1.getFirstChild; child != null; child = child.getNextSibling()) {
if(child.getNodeType() == Node.TEXT_NODE) {
str.append(child.getTextContent());
} else if(child.getNodeName().equals("value")) {
str.append(values.get(child.getAttributes().getNamedItem("name").getTextContent()));
}
}
String output = str.toString();

Print empty XML element as opening tag, closing tag

Is there any way to print empty elements like <tag></tag> rather than <tag /> using org.w3c.dom? I'm modifying XML files that need to be diff'ed against old versions of themselves for review.
If it helps, the code that writes the XML to the file:
TransformerFactory t = TransformerFactory.newInstance();
Transformer transformer = t.newTransformer();
DOMSource source = new DOMSource(doc);
StringWriter xml = new StringWriter();
StreamResult result = new StreamResult(xml);
transformer.transform(source, result);
File f = new File("output.xml");
FileWriter writer = new FileWriter(f);
BufferedWriter out = new BufferedWriter(writer);
out.write(xml.toString());
out.close();
Thanks.

You may want to consider converting both the old and the new XML file to Canonical XML - http://en.wikipedia.org/wiki/Canonical_XML - before comparing them with e.g. diff.
James Clark has a small Java program to do so on http://www.jclark.com/xml/

I'm assuming the empty elements are actually ELEMENT_NODEs with no children within the document. Try adding an empty text node to them instead. That may trick the writer into believing there is a text node there, so it will write it out as if there was one. But the text node won't output anything because it is an empty string.
Calling this method with the document as both parameters should do the trick:
private static void fillEmptyElementsWithEmptyTextNodes(
final Document doc, final Node root)
{
final NodeList children = root.getChildNodes();
if (root.getType() == Node.ELEMENT_NODE &&
children.getLength() == 0)
{
root.appendChild(doc.createTextNode(""));
}
// Recurse to children.
for(int i = 0; i < children.getLength(); ++i)
{
final Node child = children.item(i);
fillEmptyElementsWithEmptyTextNodes(doc, child);
}
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

How to make minor edits to an XML file in Java - java

Related

Parsing HTML content from XML file

Pretty print XML in java 8

How to retrieve XML including tags using the DOM parser

get node raw text

Print empty XML element as opening tag, closing tag

Categories

Resources