Converting string to XMLDocument doesn't create text nodes - java

Here's my situation.
I have a string containing XML data:
<tag>
<anotherTag> data </anotherTag>
</tag>
I take that string and I run it through this code to convert it to a Document:
try {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    return builder.parse(new InputSource(new StringReader(sXMLString)));
}
catch (Exception e) {
    // The parser couldn't be built, or the string couldn't be parsed
    ceLogger.logError("Unable to build a new XML Document from string provided:\t" + e.getMessage());
    return null;
}
The resulting XML is almost perfect. It's missing the data, however, and looks like this:
<tag>
<anotherTag />
</tag>
How can I copy over the text when creating an XML Document and why is it removing the text in the first place?
Edit:
The actual problem turned out to be in my own function that walks the XML structure and turns it back into a string. It contains this check:
if (curChild.getNodeType()==Node.ELEMENT_NODE)
    sResult.append(XMLToString((Element)children.item(i),attribute_mask));
There is no corresponding branch for TEXT nodes, so they are silently ignored. The parser did create the text nodes; my own serializer dropped them.
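A minimal sketch of what a walker with a text-node branch could look like. The class and method names (TextNodeWalkSketch, nodeToString, roundTrip) are made up for illustration and are not the asker's XMLToString; attribute handling is left out to keep it short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextNodeWalkSketch {

    // Hypothetical stand-in for XMLToString: serializes a subtree,
    // ignoring attributes to keep the sketch short.
    static String nodeToString(Node node) {
        StringBuilder out = new StringBuilder();
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                out.append('<').append(node.getNodeName()).append('>');
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    out.append(nodeToString(children.item(i)));
                }
                out.append("</").append(node.getNodeName()).append('>');
                break;
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                // The branch that was missing: copy the character data, escaped.
                out.append(escape(node.getNodeValue()));
                break;
            default:
                // Comments, processing instructions, etc. are skipped here.
                break;
        }
        return out.toString();
    }

    // Escapes the characters that must not appear raw in element content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Parses a string and serializes it again with nodeToString.
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return nodeToString(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Calling TextNodeWalkSketch.roundTrip on the XML from the question should give back the same markup with " data " still inside anotherTag, because the text node is now visited like any other child.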

Your code is correct. My only guess is that you are outputting the document incorrectly. I've tested your code and used the following method to print the result, and the XML was displayed correctly, including the text node:
public static void outputXML(Document dom) throws TransformerException
{
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    // Write to a StringWriter here; pass a File to StreamResult instead to save to a file
    StreamResult result = new StreamResult(new StringWriter());
    DOMSource source = new DOMSource(dom);
    transformer.transform(source, result);
    String xmlString = result.getWriter().toString();
    System.out.println(xmlString);
}
The output was:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tag> <anotherTag> data </anotherTag>
</tag>

Related

Java DOM Transformer - XML creation doesn't replace apostrophe and quotes in the final xml

I'm trying to create an XML document and return it as a response to the caller based on the input.
The transformer works as expected for the most part, but it doesn't convert apostrophes and quotes to their XML equivalents. Below is the code I'm using:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
// root elements
Document doc = docBuilder.newDocument();
Element rootElement = doc.createElement("template");
doc.appendChild(rootElement);
/* Adding attendant ID */
Element line = doc.createElement("line");
line.appendChild(doc.createTextNode("----&----<------>------'-----\"--------"));
Attr Attr1 = doc.createAttribute("Attr1");
Attr1.setValue("attribute value 1");
line.setAttributeNode(Attr1);
Attr Attr2 = doc.createAttribute("Attr2");
Attr2.setValue("attribute value 2");
line.setAttributeNode(Attr2);
rootElement.appendChild(line);
// write the content into xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
// Output to String
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(source, result);
String strResult = writer.toString();
//return escapeXml(strResult);
System.out.println(strResult);
Resulting output
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<template>
<line Attr1="attribute value 1" Attr2="attribute value 2">----&amp;----&lt;------&gt;------'-----"--------</line>
</template>
Expected Result
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<template>
<line Attr1="attribute value 1" Attr2="attribute value 2">----&amp;----&lt;------&gt;------&apos;-----&quot;--------</line>
</template>
Initially I thought I could escape those characters before passing them to the transformer, but then it escaped every ampersand again as "&amp;". If I replace the apostrophes or quotes after the final XML is created, the attribute delimiters get replaced as well.
I'm thinking we could solve this in two ways:
1. Escape & , < , > , ' , " myself before adding the text to the node, and have the transformer leave it alone.
2. Give the transformer explicit directions to convert ' and " to their XML equivalents.
Currently I don't know how to achieve either. Could someone help me with this? A better way to create valid XML would also be hugely appreciated.
Thanks.
Why do you want quotation marks and apostrophes to be escaped? XML doesn't require them to be escaped (except in attributes where they conflict with the attribute delimiters). The serializer knows what it's doing: trust it.
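One way to convince yourself is to serialize a document and parse the result again. The sketch below is only an illustration (QuoteRoundTripSketch and its methods are invented names); it puts the same string into a text node and an attribute, writes the document with a default Transformer, re-parses the output, and checks that both values come back unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class QuoteRoundTripSketch {

    // Builds <line Attr1="text">text</line> and serializes it with a default Transformer.
    static String serialize(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element line = doc.createElement("line");
            line.setAttribute("Attr1", text);
            line.appendChild(doc.createTextNode(text));
            doc.appendChild(line);

            StringWriter writer = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    // Re-parses the serialized XML and compares text and attribute with the input.
    static boolean survivesRoundTrip(String text) {
        try {
            Document reparsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(serialize(text))));
            Element root = reparsed.getDocumentElement();
            return text.equals(root.getTextContent())
                    && text.equals(root.getAttribute("Attr1"));
        } catch (Exception e) {
            // A parse failure would mean the serializer produced invalid XML.
            return false;
        }
    }
}
```

If survivesRoundTrip returns true for the string from the question, the output is well-formed and nothing was lost, even though the apostrophe and quote in the text node were written literally. Printing serialize(...) shows where the serializer chose to escape and where it did not.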

how to update XML data from java

I want to change the text content of a field in an XML file using Java. I used setTextContent() for this, but the XML file is not getting updated.
Here is my Java code:
public static void main(String argv[]) {
    DisclosureTranslation dt=new DisclosureTranslation();
    String filepath="E:\\Repository\\17Nov_demo\\file.xml";
    dt.getHashmap(filepath);
}
public void getHashmap(String filepath){
    try {
        DocumentBuilderFactory documentbuilderfactory=DocumentBuilderFactory.newInstance();
        DocumentBuilder documentbuilder =documentbuilderfactory.newDocumentBuilder();
        Document doc=documentbuilder.parse(filepath);
        XPath xPath = XPathFactory.newInstance().newXPath();
        Element element=doc.getDocumentElement();
        // Select the ishfield element whose name attribute is FHPIDISCLOSURELEVEL
        NodeList nodelist=(NodeList)xPath.evaluate("/DOCUMENT/ishobject/ishfields/ishfield[@name='FHPIDISCLOSURELEVEL']",
                doc.getDocumentElement(), XPathConstants.NODESET);
        System.out.println(nodelist.item(0).getTextContent());
        String val=nodelist.item(0).getTextContent();
        //String val="111";
        HashMap<String, String> hashmap=new HashMap<String,String>();
        hashmap.put("47406819852170807613486806879990", "public");
        hashmap.put("222"," HP Internal");
        String value=hashmap.get(val);
        // This only changes the in-memory DOM, not the file on disk
        nodelist.item(0).setTextContent(value);
        System.out.println(nodelist.item(0).getTextContent());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The last line displays what I want, but the change is not reflected in the XML file. How am I supposed to update my XML file?
Thanks in advance! :)
Once you have updated the element in the parsed document, write the document back to the same file as shown below. setTextContent() only changes the in-memory DOM.
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult streamResult = new StreamResult(new File(filepath));
transformer.transform(source, streamResult);
This uses the Transformer API (javax.xml.transform) to save the in-memory document back to the file.
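Put together, the parse, modify and save steps might look like the sketch below. It is an illustration under assumptions, not the asker's program: UpdateXmlFileSketch, updateText and demo are invented names, and demo works on a small temporary file with a simplified structure rather than the real file.xml.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class UpdateXmlFileSketch {

    // Sets the text of the first node matching xpathExpr and saves the file in place.
    static void updateText(File xmlFile, String xpathExpr, String newValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile);
        Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(xpathExpr, doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalArgumentException("No node matches " + xpathExpr);
        }
        node.setTextContent(newValue);
        // Without this step the change exists only in memory.
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(xmlFile));
    }

    // Demo on a temporary file with a simplified, made-up structure.
    static String demo() {
        try {
            File file = File.createTempFile("disclosure", ".xml");
            file.deleteOnExit();
            String xml = "<DOCUMENT><ishfield name=\"FHPIDISCLOSURELEVEL\">111</ishfield></DOCUMENT>";
            Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));

            updateText(file, "/DOCUMENT/ishfield[@name='FHPIDISCLOSURELEVEL']", "public");

            // Re-read the file from disk to confirm the change was saved.
            Document reread = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
            return reread.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Demo failed", e);
        }
    }
}
```

The demo re-parses the file after saving, so the value it returns comes from disk and not from the Document that was modified.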

Problems removing nodes from XML file using java

Currently I'm required to remove a specific node and its children from an XML file. However, I always get a NullPointerException whenever I try to remove the node. The "position" parameter is the number of the node to remove, e.g. position 3 should remove the reservation with ID 04113049 and everything under it.
public void removeReservation(int position){
    try{
        File file = new File("reservations.xml");
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(file);
        Element element = (Element)doc.getElementsByTagName("reservation").item(position);
        element.getParentNode().removeChild(element);
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new File("reservations.xml"));
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(source, result);
    }
    catch(Exception e){
        e.printStackTrace();
    }
}
Here are the contents of the xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<data>
<reservation_list>
<reservation>
<resID>01014664</resID>
<roomNo>0101</roomNo>
<roomType>VIPSuite</roomType>
<noOfGuest>3</noOfGuest>
<bedType>Master</bedType>
<smoking>Y</smoking>
<startDate>121313</startDate>
<endDate>121316</endDate>
<wifi>Y</wifi>
<roomView>Y</roomView>
<availability>Reserved</availability>
<name>Johnny depp</name>
<address>NTU Hall 17 #01-111</address>
<country>Singapore</country>
<gender>Male</gender>
<nationality>Singaporean</nationality>
<contact>92003239</contact>
<creditCardNo>1234567812345678</creditCardNo>
<creditCardCSV>432</creditCardCSV>
<creditCardExpDate>11/16</creditCardExpDate>
<identity>U0000000I</identity>
</reservation>
<reservation>
<resID>11025652</resID>
<roomNo>1102</roomNo>
<roomType>Double</roomType>
<noOfGuest>3</noOfGuest>
<bedType>Master</bedType>
<smoking>Y</smoking>
<startDate>1212</startDate>
<endDate>1213</endDate>
<wifi>Y</wifi>
<roomView>Y</roomView>
<availability>Reserved</availability>
<name>Thomas</name>
<address>Mountbatten #2-12 Garden ave</address>
<country>Singapore</country>
<gender>Male</gender>
<nationality>Singaporean</nationality>
<contact>93482032</contact>
<creditCardNo>1234567812345678</creditCardNo>
<creditCardCSV>588</creditCardCSV>
<creditCardExpDate>3/16</creditCardExpDate>
<identity>U1234567I</identity>
</reservation>
<reservation>
<resID>04113049</resID>
<roomNo>0411</roomNo>
<roomType>VIPSuite</roomType>
<noOfGuest>7</noOfGuest>
<bedType>Master</bedType>
<smoking>Y</smoking>
<startDate>121112</startDate>
<endDate>232333</endDate>
<wifi>Y</wifi>
<roomView>Y</roomView>
<availability>Reserved</availability>
<name>elaine</name>
<address>punggol</address>
<country>Singapore</country>
<gender>Female</gender>
<nationality>Singaporean</nationality>
<contact>12345672</contact>
<creditCardNo>1234123412341234</creditCardNo>
<creditCardCSV>123</creditCardCSV>
<creditCardExpDate>1212</creditCardExpDate>
<identity>S96777777777F</identity>
</reservation>
</reservation_list>
</data>
First, item() is zero-based (indices start at 0). Your file has three reservations, so the valid indices are 0 to 2 and item(3) returns null. To remove the third reservation, pass index 2.
Second, you should always check that a reservation exists at that position before you try to remove it. In your case, calling .getParentNode() on the null element is what throws the NullPointerException.
Element element = (Element)doc.getElementsByTagName("reservation").item(position);
if ( null != element) {
    element.getParentNode().removeChild(element);
    //etc
}
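If the caller really wants to pass 1-based positions (so that 3 means the third reservation, as in the question), the conversion and the bounds check can live in one place. This is a sketch under that assumption; RemoveReservationSketch, removeAt and remainingIds are invented names, and the demo uses a trimmed-down copy of the XML held in a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RemoveReservationSketch {

    // Assumes position is 1-based. Returns false instead of throwing when out of range.
    static boolean removeAt(Document doc, int position) {
        NodeList reservations = doc.getElementsByTagName("reservation");
        int index = position - 1; // item() is zero-based
        if (index < 0 || index >= reservations.getLength()) {
            return false;
        }
        Node target = reservations.item(index);
        target.getParentNode().removeChild(target);
        return true;
    }

    // Demo: removes one reservation from a shortened document and lists the remaining IDs.
    static String remainingIds(int position) {
        String xml = "<data><reservation_list>"
                + "<reservation><resID>01014664</resID></reservation>"
                + "<reservation><resID>11025652</resID></reservation>"
                + "<reservation><resID>04113049</resID></reservation>"
                + "</reservation_list></data>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            removeAt(doc, position);
            NodeList ids = doc.getElementsByTagName("resID");
            StringBuilder joined = new StringBuilder();
            for (int i = 0; i < ids.getLength(); i++) {
                if (i > 0) joined.append(',');
                joined.append(ids.item(i).getTextContent());
            }
            return joined.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Demo failed", e);
        }
    }
}
```

With this convention, position 3 removes reservation 04113049, and position 4 leaves the document untouched because there is no fourth reservation.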

Preserve newline in xml while using builder.parse method & Transformer

The objective is to read from an XML file and write to a new XML file while preserving newlines. We need the Document object to perform other XML tasks.
Say source.xml looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Code><![CDATA[code line1
code line 2
code line 3
code line 4]]></Code>
Now the destination should look the same with the newlines in the code element. But instead it ignores the newlines and makes it one line.
For writing, I am using the method below:
public static void writeFile(Document xml, File writeTo)
{
    try
    {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        DOMSource source = new DOMSource(xml);
        StreamResult result = new StreamResult(writeTo);
        transformer.transform(source, result);
    }
    catch(TransformerException e)
    {
        System.out.println("Couldn't write file " + writeTo);
        e.printStackTrace();
    }
}
The Document xml is obtained using the parse(File) method of DocumentBuilder, roughly along the lines of:
File file; // a list of files is recursively obtained from a given folder.
DocumentBuilderFactory documentBuilderfactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = documentBuilderfactory.newDocumentBuilder();
Document xml = builder.parse(file);
The builder.parse seems to be losing the newlines in the CDATA of Code element.
How do we preserve the newlines?
I am new to Java APIs.
When I put your snippets together I get this program:
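import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class TestNewLine {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerException {
        DocumentBuilderFactory documentBuilderfactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = documentBuilderfactory.newDocumentBuilder();
        // data.xml must be on the classpath next to this class
        Document xml = builder.parse(TestNewLine.class.getResourceAsStream("data.xml"));
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        DOMSource source = new DOMSource(xml);
        StreamResult result = new StreamResult(System.out);
        transformer.transform(source, result);
    }
}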
and it prints out:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Code><![CDATA[code line1
code line 2
code line 3
code line 4]]></Code>
As far as I understood, the newline is preserved already. What output did you expect?
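To narrow down where a newline could get lost, it can help to look at the parsed DOM directly, before any Transformer is involved. A small sketch (CdataNewlineSketch and codeText are invented names) that parses a string and returns the text of the root element:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataNewlineSketch {

    // Parses the XML and returns the text content of the root element,
    // so the caller can check whether it still contains '\n'.
    static String codeText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

If the returned string still contains the line breaks, builder.parse is not the step that drops them. Note that an XML parser normalizes \r\n and lone \r in the input to \n, so the DOM holds \n regardless of how the source file was saved.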

Parsing an XML document with XPath: why is an <?xml ...?> header added to the result?

I searched on Google first and found many results about how to parse an XML document with XPath. I have parsed it, but I want to convert a NodeList to a String, and I have created a method for it:
private String processResult(Document responseDocument) throws XPathExpressionException, TransformerException {
    NodeList soaphead = responseDocument.getElementsByTagName("xmlTagToTrasform");
    StringWriter sw = new StringWriter();
    Transformer serializer = TransformerFactory.newInstance().newTransformer();
    serializer.transform(new DOMSource(soaphead.item(0)), new StreamResult(sw));
    String result = sw.toString();
    return result;
}
This method works perfectly but the transformer adds an <?xml version="1.0" encoding="UTF-8"?> in the header of the result, and I don't want that. This is the result of the method:
<?xml version="1.0" encoding="UTF-8"?>
<xmlTagToTrasform>
<xmlTagToTrasform2>
.
.
.
.
</xmlTagToTrasform2>
</xmlTagToTrasform>
You can configure the transformer not to output the XML declaration, before you call transform:
serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
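In context, that could look like the following sketch. OmitDeclarationSketch, nodeToString and demo are invented names; the demo builds a tiny document in memory instead of using the asker's response document.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class OmitDeclarationSketch {

    // Serializes a single node without the leading <?xml ...?> declaration.
    static String nodeToString(Node node) throws TransformerException {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        // Must be set before transform() is called.
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        serializer.transform(new DOMSource(node), new StreamResult(sw));
        return sw.toString();
    }

    // Demo: builds <outer><inner>x</inner></outer> and serializes only <inner>.
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element outer = doc.createElement("outer");
            Element inner = doc.createElement("inner");
            inner.setTextContent("x");
            outer.appendChild(inner);
            doc.appendChild(outer);
            return nodeToString(doc.getElementsByTagName("inner").item(0));
        } catch (Exception e) {
            throw new IllegalStateException("Demo failed", e);
        }
    }
}
```

The string returned by demo() starts directly with the inner element, which is usually what is wanted when the fragment is embedded in another document or message.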
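XML is a markup language, and the declaration at the top of a document specifies the XML version and the encoding. In XML 1.0 the declaration is optional (though recommended); it is mandatory only in XML 1.1. Omitting it is fine when you only need a fragment as a string.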
