XML writing in java creating unnecessary new lines - java

I am using the W3C DOM API to write an XML file.
Creating the first child node causes no trouble.
But when I append a new node to the existing file a second time, unwanted blank lines appear in the previous nodes, and the number of blank lines keeps growing every time I insert new nodes.
Here is my code:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("D:\\TestXml.xml"));
Element rootElement = doc.getDocumentElement();
Element supercar = doc.createElement("supercars");
rootElement.appendChild(supercar);
Element carname = doc.createElement("carname");
carname.appendChild(doc.createTextNode("Ferrari 103"));
supercar.appendChild(carname);
Element carname1 = doc.createElement("carname");
carname1.appendChild(doc.createTextNode("Ferrari 204"));
supercar.appendChild(carname1);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("D:\\TestXml.xml"));
transformer.transform(source, result);
And here is the generated file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars>
<supercars>
<carname>Ferrari 101</carname>
<carname>Ferrari 202</carname>
</supercars>
<supercars>
<carname>Ferrari 103</carname>
<carname>Ferrari 204</carname>
</supercars>
</cars>
The code above is used to append the second node. The first time, the generated file has no extra blank lines.
But if I add 10 new nodes, the file has so many unnecessary blank lines that it grows to more than 300 lines.
The file size also increases.
I cannot work out why this is occurring.
The problem occurs on every new node insertion.
Any suggestions would be really helpful.

Consider running the identity transform XSLT, where <xsl:strip-space> removes such line breaks and spaces between nodes. You can easily incorporate XSLT into your existing code:
XSLT (save the below as an .xsl file; it copies the entire document as is)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Java
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
...
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("D:\\TestXml.xml"));
Element rootElement = doc.getDocumentElement();
Element supercar = doc.createElement("supercars");
rootElement.appendChild(supercar);
Element carname = doc.createElement("carname");
carname.appendChild(doc.createTextNode("Ferrari 103"));
supercar.appendChild(carname);
Element carname1 = doc.createElement("carname");
carname1.appendChild(doc.createTextNode("Ferrari 204"));
supercar.appendChild(carname1);
Source xslt = new StreamSource(new File("C:\\Path\\To\\Style.xsl")); // LOAD STYLESHEET
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(xslt); // APPLY XSLT
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("D:\\TestXml.xml"));
transformer.transform(source, result);
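To try the approach without touching any files, here is a minimal in-memory sketch. It assumes the JDK's built-in XSLT processor; the class name StripSpaceDemo, the method reserialize, and the sample input are hypothetical and only illustrate the idea that whitespace-only text nodes left over from earlier writes are dropped before the document is indented again.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StripSpaceDemo {
    // Identity transform plus strip-space, inlined so the sketch is self-contained.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" indent=\"yes\"/>"
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:template match=\"node()|@*\">"
      + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses the XML into a DOM, then writes it back through the stylesheet.
    static String reserialize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Simulates a file that has already accumulated blank lines.
        String messy = "<cars>\n\n\n  <supercars>\n\n    <carname>Ferrari 101</carname>\n\n\n  </supercars>\n\n</cars>";
        System.out.println(reserialize(messy));
    }
}
```

Because the whitespace is stripped on every write, the output should stay the same size no matter how many times the file is read and rewritten.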

Related

Extract all the recurring element of XML using java

My XML looks like the one below, and I need to extract the multiple ID elements into
an output XML:
<?xml version="1.0" encoding="utf-8"?>
<Stock>
<PIdentification>
<CatalogVersion></CatalogVersion>
<AccountID></AccountID>
<CustomerId></CustomerId>
</ProviderIdentification>
<Product>
<ArticleName>Monitors</ArticleName>
<BaseUnit></BaseUnit>
<Notes></Notes>
<ID>11f13e2e-ae97-45b5-a9a9-23fa7f6bb767</ID>
<ID>b22834c0-a570-4e6b-97c3-5067a14d118d</ID>
<ID>ed458593-5e1a-4dc1-94f0-a66eeef2dd79</ID>
<ID>d25584a9-1db2-48cf-9a70-9b81e5a7e7f2</ID>
</Product>
</Stock>
I have used a NodeList to extract "ID", but I am getting just one element
and not all 4. Below is the relevant part of the code:
{
Node IDNode = element.getElementsByTagName("ID").item(0);
IDXml = toStringXml(IDNode , true);
}
I am not able to iterate in a loop to get all the IDs. Please let me
know how to get all of them; any help is appreciated.
private static String toStringXml(Node elt, boolean xmlDeclaration)
throws TransformerException {
TransformerFactory transformerFactory = TransformerFactory
.newInstance();
Transformer transformer = transformerFactory.newTransformer();
if(xmlDeclaration)
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
DOMSource source = new DOMSource(elt);
StreamResult result = new StreamResult(new StringWriter());
transformer.transform(source, result);
return result.getWriter().toString();
}
You get all the IDs, but you are only looking at the first one with .item(0).
The method getElementsByTagName("ID") returns a NodeList, so you can go through all the IDs, for example like this:
File xmlFile = new File("src/main/resources/example.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document element = dBuilder.parse(xmlFile);
NodeList list = element.getElementsByTagName("ID");
for (int i = 0; i < list.getLength(); i++){
Node specificIDNode = list.item(i);
System.out.println(specificIDNode.getTextContent());
}
The XML shown in your question is malformed: the <PIdentification> tag opened on line 3 is closed as </ProviderIdentification>.
The correct XML should look like this:
<?xml version="1.0" encoding="utf-8"?>
<Stock>
<ProviderIdentification>
<CatalogVersion></CatalogVersion>
<AccountID></AccountID>
<CustomerId></CustomerId>
</ProviderIdentification>
<Product>
<ArticleName>Monitors</ArticleName>
<BaseUnit></BaseUnit>
<Notes></Notes>
<ID>11f13e2e-ae97-45b5-a9a9-23fa7f6bb767</ID>
<ID>b22834c0-a570-4e6b-97c3-5067a14d118d</ID>
<ID>ed458593-5e1a-4dc1-94f0-a66eeef2dd79</ID>
<ID>d25584a9-1db2-48cf-9a70-9b81e5a7e7f2</ID>
</Product>
</Stock>
With that fix, the code provided by @Penguin74 displays all the IDs from your XML.
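A self-contained variant of the same loop, collecting the values into a list instead of printing them, might look like the sketch below. The class name IdExtractor and the method extractIds are hypothetical, and the input is passed as a string so that no file is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IdExtractor {
    // Returns the text of every <ID> element, in document order.
    static List<String> extractIds(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList ids = doc.getElementsByTagName("ID");
        List<String> values = new ArrayList<>();
        // Visit every match instead of only item(0).
        for (int i = 0; i < ids.getLength(); i++) {
            values.add(ids.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Stock><Product>"
                + "<ID>11f13e2e-ae97-45b5-a9a9-23fa7f6bb767</ID>"
                + "<ID>b22834c0-a570-4e6b-97c3-5067a14d118d</ID>"
                + "</Product></Stock>";
        System.out.println(extractIds(xml));
    }
}
```

To get each ID as an XML fragment rather than as text, the loop body could call the question's toStringXml on ids.item(i) instead.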

Java XML removing node

I am trying to remove a node from a large xml file. With this code the tags of the other elements are altered as well. I was hoping someone could explain why or how to fix it.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document document = dbf.newDocumentBuilder().parse(new File(filePath)); //filePath - source file
/*while (document.getElementsByTagName("IMFile").getLength() != 0){
//Loop until all childs are removed
Element element = (Element) document.getElementsByTagName("IMFile").item(0);
element.getParentNode().removeChild(element);
}*/
//Test for first appearance
Element element = (Element) document.getElementsByTagName("IMFile").item(0);
element.getParentNode().removeChild(element);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.transform(new DOMSource(document), new StreamResult(new File(filePath+"_New"))); //destination
It changes the order of attributes, such as:
<Attribute id="7" value="1920" name="width"/> to <Attribute id="7" name="width" value="1920"/>
It also collapses some empty elements into self-closing tags:
<PowerPointFilename></PowerPointFilename> to <PowerPointFilename/>
You can use a SAX transformer to modify an XML document while preserving attribute order:
public static void main(String[] args) throws IOException, TransformerException, SAXException {
XMLReader reader = XMLReaderFactory.createXMLReader();
TransformerFactory tf = TransformerFactory.newInstance();
// Load the transformer definition from the file strip.xsl:
Transformer t = tf.newTransformer(new SAXSource(reader, new InputSource(new FileInputStream("strip.xsl"))));
// Transform the file test.xml to stdout:
t.transform(new SAXSource(reader, new InputSource(new FileInputStream("test.xml"))), new StreamResult(System.out));
}
Here's an XSL transform to strip IMFile elements:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Copy -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- Strip IMFile elements -->
<xsl:template match="IMFile"/>
</xsl:stylesheet>
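The stylesheet can be exercised in memory before pointing it at a large file. The sketch below assumes the JDK's default TransformerFactory; the class name StripElementDemo, the method strip, and the sample input are hypothetical. It only demonstrates that the empty template removes every IMFile element and makes no claim about attribute order, which depends on the parser and serializer in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripElementDemo {
    // Identity template plus an empty template that swallows IMFile elements.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"node()|@*\">"
      + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"IMFile\"/>"
      + "</xsl:stylesheet>";

    // Streams the input through the stylesheet without building a DOM first.
    static String strip(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><IMFile name=\"a\"/><Keep id=\"1\"/><IMFile/></root>";
        System.out.println(strip(xml));
    }
}
```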

Editing xml content in java and passing it as string, using node preferably

I have an XML document that will be used as a template:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><entry xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"><content type="application/xml"><m:properties><d:AccountEnabled>true</d:AccountEnabled><d:DisplayName>SampleAppTestj5</d:DisplayName><d:MailNickname>saTestj5</d:MailNickname><d:Password>Qwerty1234</d:Password><d:UserPrincipalName>saTestj5@identropy.us</d:UserPrincipalName></m:properties></content></entry>
I'm loading it in Java using this code, where payLoadXML.xml has the above content:
InputStream is = getClass().getClassLoader().getResourceAsStream("/payLoadXML.xml");
Now I'm trying to edit the tag values, for example changing a value from "saTestj5" to "saTestj6", and then converting the entire XML to a string. Can anyone tell me how I can achieve this? I was told this can be done using "Node"; is that possible?
Use JAXB or a SAX parser: convert the XML into an object, change the object through its getter and setter methods, and convert it back to XML.
Try this:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = null;
docBuilder = docFactory.newDocumentBuilder();
Document doc = null;
// ClassLoader resource names take no leading slash; with one, the lookup returns null
InputStream is = getClass().getClassLoader().getResourceAsStream("payLoadXML.xml");
doc = docBuilder.parse(is);
Node staff = doc.getElementsByTagName("m:properties").item(0);
Text givenNameValue = doc.createTextNode("abc");
Element givenName = doc.createElement("d:GivenName");
givenName.appendChild(givenNameValue);
staff.appendChild(givenName);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = null;
transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(source, result);
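The code above appends a new element. To change an existing value, as the question asks, a sketch along the following lines may help. It assumes the element to edit is d:MailNickname; the class name PayloadEditor and the method setMailNickname are hypothetical, and the XML is passed in as a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PayloadEditor {
    // Namespace bound to the "d" prefix in the template.
    static final String D_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices";

    // Replaces the text of the first d:MailNickname element and returns the XML as a string.
    static String setMailNickname(String xml, String newValue) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagNameNS(D_NS, "MailNickname");
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("no MailNickname element found");
        }
        nodes.item(0).setTextContent(newValue);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<entry xmlns=\"http://www.w3.org/2005/Atom\" xmlns:d=\"" + D_NS + "\">"
                + "<d:MailNickname>saTestj5</d:MailNickname></entry>";
        System.out.println(setMailNickname(xml, "saTestj6"));
    }
}
```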

Add xml-stylesheet and get standalone = yes

I added the solution to the code below.
The code at the bottom is what I have. I removed the creation of all tags.
At the top of the XML file I get <?xml version="1.0" encoding="UTF-8" standalone="no"?>. Note that standalone is "no", even though I have set it to "yes".
The first question: How do I get standalone = yes?
I would like to add <?xml-stylesheet type="text/xsl" href="my.stylesheet.xsl"?> at line two in the xml file.
Second question: How do I do that?
Some useful links? Anything?
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
doc.setXmlStandalone(true);
ProcessingInstruction pi = doc.createProcessingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
Element root = doc.createElement("root-element");
doc.appendChild(root);
doc.insertBefore(pi, root);
<cut>
TransformerFactory transfac = TransformerFactory.newInstance();
transfac.setAttribute("indent-number", new Integer(2));
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
trans.setOutputProperty(OutputKeys.STANDALONE, "yes");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
trans.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "name");
FileOutputStream fout = new FileOutputStream(filepath);
BufferedOutputStream bout= new BufferedOutputStream(fout);
trans.transform(new DOMSource(doc), new StreamResult(new OutputStreamWriter(bout, "utf-8")));
I added
doc.setXmlStandalone(true);
ProcessingInstruction pi = doc.createProcessingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
before the cut and
doc.insertBefore(pi, root);
right after the root element was appended to the doc.
In my code, I wrote:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
document.setXmlStandalone(true);
TransformerFactory tfactory = TransformerFactory.newInstance();
Transformer transformer = tfactory.newTransformer();
transformer.setOutputProperty(OutputKeys.STANDALONE, "yes");
output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
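Putting both parts of the solution together, a minimal in-memory sketch could look like the following. It assumes the JDK's built-in transformer; the class name StandaloneDemo and the method build are hypothetical. The document is marked standalone, the processing instruction is inserted before the root element, and the result is written to a string.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ProcessingInstruction;

class StandaloneDemo {
    static String build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.setXmlStandalone(true); // set on the document as well as on the transformer

        Element root = doc.createElement("root-element");
        doc.appendChild(root);

        // The processing instruction has to precede the root element.
        ProcessingInstruction pi = doc.createProcessingInstruction(
                "xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
        doc.insertBefore(pi, root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.STANDALONE, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```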

Java XML Output - proper indenting for child items

I'd like to serialize a simple data model into XML. I've been using the standard org.w3c.dom-related code (see below). The indentation is better than without OutputKeys.INDENT, yet one little thing remains: proper indentation for child elements.
I know that this has been asked before on Stack Overflow, yet that configuration did not work for me. This is the code I'm using:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();
doc = addItemsToDocument(doc);
// The addItemsToDocument method adds childElements to the document.
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", new Integer(4));
// switching to setAttribute("indent-number", 4); doesn't help
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(outFile);
// outFile is a regular File outFile = new File("some/path/foo.xml");
transformer.transform(source, result);
The output produced is :
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<stuffcontainer>
<stuff description="something" duration="240" title="abc">
<otherstuff />
</stuff>
</stuffcontainer>
Whereas I would want it (for more clarity) like:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<stuffcontainer>
    <stuff description="something" duration="240" title="abc">
        <otherstuff />
    </stuff>
</stuffcontainer>
I was just wondering if there is a way of doing this, make it properly indented for the child elements.
Thank you in advance !
Happy Easter coding :-) !
If the Transformer implementation you're using is Xalan-J, then you should be able to use:
transformer.setOutputProperty(
"{http://xml.apache.org/xslt}indent-amount", "5");
See also: http://xml.apache.org/xalan-j/usagepatterns.html
import com.sun.org.apache.xml.internal.serializer.OutputPropertiesFactory;
transformer.setOutputProperty(OutputPropertiesFactory.S_KEY_INDENT_AMOUNT, "4");
Document doc;
.....
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.transform(new DOMSource(doc), new StreamResult(new File("filename.xml")));
transformer.transform(new DOMSource(doc), new StreamResult(System.out));
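For a quick check of the indent-amount property, here is a self-contained sketch that builds a small document like the one in the question and serializes it to a string. It assumes the JDK's built-in (Xalan-derived) transformer, which recognizes the {http://xml.apache.org/xslt}indent-amount key; other implementations may ignore or reject it. The class name IndentDemo and the method serialize are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class IndentDemo {
    static String serialize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        // stuffcontainer > stuff > otherstuff, mirroring the question's structure.
        Element container = doc.createElement("stuffcontainer");
        Element stuff = doc.createElement("stuff");
        stuff.setAttribute("title", "abc");
        stuff.appendChild(doc.createElement("otherstuff"));
        container.appendChild(stuff);
        doc.appendChild(container);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific key: number of spaces per nesting level.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize());
    }
}
```

Setting INDENT alone only turns line breaks on; the indent-amount key is what controls how far each nested level is shifted.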
