I have a method that creates and writes to an XML file, but it produces a corrupted result: my Turkish characters are written as hexadecimal expressions. Although I'm using UTF-8, I couldn't solve the problem. I checked the output with both the Sublime and Notepad++ editors.
public boolean add(BatFile batFile) throws Exception {
File inputFile = new File(fileLocation);
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(inputFile);
Element rootElement = doc.getDocumentElement();
Element batFileElement = doc.createElement("BatFile");
Element batJobName = doc.createElement("Name");
Element batFileBriefDesc = doc.createElement("BriefDesc");
Element batFileDesc = doc.createElement("Desc");
Element batFileName = doc.createElement("FileName");
Element batCommandArgs = doc.createElement("CommandArgs");
for (int k = 0; k < batFile.getCommandArgs().size(); k++) {
Element commandArg = doc.createElement("CommandArg");
// commandArg.setAttribute("ID", String.valueOf(k));
Element batCreationTime = doc.createElement("CreationTime");
Element batSchedulerPattern = doc.createElement("SchedulerPattern");
Element batTaskID = doc.createElement("TaskID");
if (batFile.getTaskID() != null) {
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource domSource = new DOMSource(doc);
StreamResult result = new StreamResult(new FileWriter(inputFile));
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(domSource, result);
return true;
When I test it with the code below:
public void testAddingTask() throws Exception {
IBAO testBao = XMLBAO.getInstance();
BatFile testBatFile = new BatFile();
It produces this result:

You're writing to a character stream and not letting the API control which encoding the data is written as. FileWriter uses the default platform encoding which might not be UTF-8:
The constructors of this class assume that the default character encoding and the default byte-buffer size are acceptable.
Use a FileOutputStream with the StreamResult (in a try-with-resources block), so that the transformer itself encodes the bytes.
You might also be having issues due to Java source file encodings. Consider using Unicode escapes instead of literals. That is, "\u015E" instead of "Ş".
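A minimal sketch of that suggestion, assuming a small stand-in document instead of the asker's BatFile data. The class name Utf8XmlWriterSketch and the helper sampleDocument are hypothetical. Because the transformer receives a byte stream rather than a Writer, it applies the encoding named in OutputKeys.ENCODING itself:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class Utf8XmlWriterSketch {

    // Hypothetical stand-in for the document built in add(BatFile).
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("BatFile");
        root.setTextContent("\u015Eirket"); // Unicode escape for the Turkish capital S with cedilla
        doc.appendChild(root);
        return doc;
    }

    // Serializes the document as UTF-8 bytes; no FileWriter is involved,
    // so the platform default encoding never comes into play.
    static void write(Document doc, File target) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (OutputStream out = new FileOutputStream(target)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        File target = File.createTempFile("batfile", ".xml");
        write(sampleDocument(), target);
        // Read the bytes back explicitly as UTF-8 to check the round trip.
        System.out.println(new String(Files.readAllBytes(target.toPath()), StandardCharsets.UTF_8));
    }
}
```

When inspecting the file in an editor, make sure the editor also opens it as UTF-8.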


Parsing HTML content from XML file

<xbrli:xbrl xmlns:aoi="http://www.aointl.com/20160331" xmlns:country="http://xbrl.sec.gov/country/2016-01-31" xmlns:currency="http://xbrl.sec.gov/currency/2016-01-31" xmlns:dei="http://xbrl.sec.gov/dei/2014-01-31" xmlns:exch="http://xbrl.sec.gov/exch/2016-01-31" xmlns:invest="http://xbrl.sec.gov/invest/2013-01-31" xmlns:iso4217="http://www.xbrl.org/2003/iso4217" xmlns:link="http://www.xbrl.org/2003/linkbase" xmlns:naics="http://xbrl.sec.gov/naics/2011-01-31" xmlns:nonnum="http://www.xbrl.org/dtr/type/non-numeric" xmlns:num="http://www.xbrl.org/dtr/type/numeric" xmlns:ref="http://www.xbrl.org/2006/ref" xmlns:sic="http://xbrl.sec.gov/sic/2011-01-31" xmlns:stpr="http://xbrl.sec.gov/stpr/2011-01-31" xmlns:us-gaap="http://fasb.org/us-gaap/2016-01-31" xmlns:us-roles="http://fasb.org/us-roles/2016-01-31" xmlns:us-types="http://fasb.org/us-types/2016-01-31" xmlns:utreg="http://www.xbrl.org/2009/utr" xmlns:xbrldi="http://xbrl.org/2006/xbrldi" xmlns:xbrldt="http://xbrl.org/2005/xbrldt" xmlns:xbrli="http://www.xbrl.org/2003/instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<link:schemaRef xlink:href="aoi-20160331.xsd" xlink:type="simple"/>
<xbrli:context id="FD2016Q4YTD">
<xbrli:identifier scheme="http://www.sec.gov/CIK">0000939930</xbrli:identifier>
<aoi:OtherIncomeAndExpensePolicyTextBlock contextRef="FD2016Q4YTD" id="Fact-F51C7616E17E5B8B0B770D410BBF5A3E">
<div style="font-family:Times New Roman;font-size:10pt;"><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;font-weight:bold;">Other Income (Expense)</font></div><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;"></font></div></div>
This is my XML (XBRL), which I need to parse. It is my input, and I don't know whether it is valid or not, but I need output like this:
<div style="font-family:Times New Roman;font-size:10pt;"><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;font-weight:bold;">Other Income (Expense)</font></div><div style="line-height:120%;text-align:justify;font-size:10pt;"><font style="font-family:inherit;font-size:10pt;"></font></div></div>
Could someone please help me with this? I have been stuck on this problem for the last two weeks.
This is the code I am using:
File fXmlFile = new File("/home/devteam-user1/Desktop/ky/UnitTesting.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
XPath xPath = XPathFactory.newInstance().newXPath();
final String DIV_UNDER_ROOT = "/*/aoi";
NodeList divList = (NodeList)xPath.compile(DIV_UNDER_ROOT)
.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < divList.getLength() ; i++) { // just in case there is more than one
Node divNode = divList.item(i);
//nodeToString method below
private static String nodeToString(Node node) throws Exception
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
StreamResult result = new StreamResult(new StringWriter());
transformer.transform(new DOMSource(node), result);
return result.getWriter().toString();
This works well for me:
public static void main(String[] args) throws IOException {
FileInputStream fis = new FileInputStream("yourfile.xml");
Document doc = Jsoup.parse(Utils.streamToString(fis));
Your main issue lies with
final String DIV_UNDER_ROOT = "/*/aoi";
Which is an XPath expression that matches "any node 2 levels under the root, which has a local name of aoi and no namespace". This is not what you want.
You want to match any contents of a node that is two levels deep, whose namespace is aliased by "aoi" (which means it belongs to the "http://www.aointl.com/20160331" namespace), and whose local name is "OtherIncomeAndExpensePolicyTextBlock".
Matching namespaces in XPath in Java is quite cumbersome (see XPath with namespace in Java and How to query XML using namespaces in Java with XPath?), but long story short, you could try this instead:
final String DIV_UNDER_ROOT = "//*[local-name()='OtherIncomeAndExpensePolicyTextBlock' and namespace-uri()='http://www.aointl.com/20160331']/*";
This will only work if your DocumentBuilderFactory is namespace aware, so make sure to configure it like so:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
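Putting the two pieces together, here is a minimal sketch that runs the expression against a trimmed-down version of the XBRL input. The class name NamespaceXPathSketch, the method findPolicyChildren and the shortened sample XML are assumptions made for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceXPathSketch {

    static final String AOI_NS = "http://www.aointl.com/20160331";

    // Shortened sample; the real file declares many more namespaces.
    static final String SAMPLE =
        "<xbrli:xbrl xmlns:aoi=\"" + AOI_NS + "\" xmlns:xbrli=\"http://www.xbrl.org/2003/instance\">"
      + "<aoi:OtherIncomeAndExpensePolicyTextBlock contextRef=\"FD2016Q4YTD\">"
      + "<div><font>Other Income (Expense)</font></div>"
      + "</aoi:OtherIncomeAndExpensePolicyTextBlock>"
      + "</xbrli:xbrl>";

    // Returns the child elements of every aoi:OtherIncomeAndExpensePolicyTextBlock.
    static NodeList findPolicyChildren(String xml) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        dbFactory.setNamespaceAware(true); // required for local-name() / namespace-uri() to behave as expected
        Document doc = dbFactory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        String expression = "//*[local-name()='OtherIncomeAndExpensePolicyTextBlock'"
            + " and namespace-uri()='" + AOI_NS + "']/*";
        return (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList divs = findPolicyChildren(SAMPLE);
        for (int i = 0; i < divs.getLength(); i++) {
            System.out.println(divs.item(i).getNodeName());
        }
    }
}
```

Each node returned this way can then be passed to the nodeToString method from the question.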

Pretty print XML in java 8

I have an XML file stored as a DOM Document and I would like to pretty print it to the console, preferably without using an external library. I am aware that this question has been asked multiple times on this site, but none of the previous answers have worked for me. I am using Java 8, so perhaps this is where my code differs from previous questions? I have also tried to set the transformer manually using code found on the web, but this just caused a "not found" error.
Here is my code which currently just outputs each xml element on a new line to the left of the console.
import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class Test {
public Test(){
try {
//java.lang.System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");
DocumentBuilderFactory dbFactory;
DocumentBuilder dBuilder;
Document original = null;
try {
dbFactory = DocumentBuilderFactory.newInstance();
dBuilder = dbFactory.newDocumentBuilder();
original = dBuilder.parse(new InputSource(new InputStreamReader(new FileInputStream("xml Store - Copy.xml"))));
} catch (SAXException | IOException | ParserConfigurationException e) {
StringWriter stringWriter = new StringWriter();
StreamResult xmlOutput = new StreamResult(stringWriter);
TransformerFactory tf = TransformerFactory.newInstance();
//tf.setAttribute("indent-number", 2);
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(new DOMSource(original), xmlOutput);
} catch (Exception ex) {
throw new RuntimeException("Error converting to String", ex);
public static void main(String[] args){
new Test();
In reply to Espinosa's comment, here is a solution when "the original xml is not already (partially) indented or contain new lines".
Excerpt from the article (see References below) inspiring this solution:
Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath’s normalize-space to locate all the whitespace nodes and remove them first.
Java Code
public static String toPrettyString(String xml, int indent) {
try {
// Turn xml string into a document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
// Remove whitespaces outside tags
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
Node node = nodeList.item(i);
node.getParentNode().removeChild(node);
}
// Setup pretty print options
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", indent);
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
// Return pretty print xml string
StringWriter stringWriter = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
return stringWriter.toString();
} catch (Exception e) {
throw new RuntimeException(e);
Sample usage
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
System.out.println(toPrettyString(xml, 4));
<name>Coco Puff</name>
Java: Properly Indenting XML String published on MyShittyCode
Save new XML node to file
I guess that the problem is related to blank text nodes (i.e. text nodes with only whitespaces) in the original file. You should try to programmatically remove them just after the parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);
for (int i = 0; i < blankTextNodes.getLength(); i++) {
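As a self-contained illustration of the idea above, the following sketch removes the blank text nodes and then serializes with indentation. The class name BlankTextNodeSketch and its helper methods are hypothetical, and the exact indentation of the output may differ between JDK versions:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BlankTextNodeSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Deletes every text node that consists of whitespace only.
    static void removeBlankTextNodes(Document doc) throws Exception {
        XPathExpression xpath = XPathFactory.newInstance().newXPath()
            .compile("//text()[normalize-space(.) = '']");
        NodeList blankTextNodes = (NodeList) xpath.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < blankTextNodes.getLength(); i++) {
            Node blank = blankTextNodes.item(i);
            blank.getParentNode().removeChild(blank);
        }
    }

    static String prettyPrint(String xml) throws Exception {
        Document doc = parse(xml);
        removeBlankTextNodes(doc);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<root>\n \n<name>Coco Puff</name>\n <total>10</total> </root>"));
    }
}
```

The XPath result is a snapshot of the matching nodes, so removing them inside the loop does not disturb the iteration.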
This works on Java 8:
public static void main (String[] args) throws Exception {
String xmlString = "<hello><from>ME</from></hello>";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(new InputSource(new StringReader(xmlString)));
pretty(document, System.out, 2);
private static void pretty(Document document, OutputStream outputStream, int indent) throws Exception {
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
if (indent > 0) {
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", Integer.toString(indent));
Result result = new StreamResult(outputStream);
Source source = new DOMSource(document);
transformer.transform(source, result);
I've written a simple class for removing whitespace in documents. It supports the command line and does not use DOM / XPath.
Edit: Come to think of it, the project also contains a pretty-printer which handles existing whitespace:
PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().ignoreWhitespace().build();
Underscore-java has static method U.formatXml(string). I am the maintainer of the project. Live example
import com.github.underscore.U;
public class MyClass {
public static void main(String args[]) {
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
<name>Coco Puff</name>
I didn't like any of the common XML formatting solutions because they all remove more than 1 consecutive new line character (for some reason, removing spaces/tabs and removing new line characters are inseparable...). Here's my solution, which was actually made for XHTML but should do the job with XML as well:
public String GenerateTabs(int tabLevel) {
char[] tabs = new char[tabLevel * 2];
Arrays.fill(tabs, ' ');
//char[] tabs = new char[tabLevel];
//Arrays.fill(tabs, '\t');
return new String(tabs);
public String FormatXHTMLCode(String code) {
// Split on new lines.
String[] splitLines = code.split("\\n", 0);
int tabLevel = 0;
// Go through each line.
for (int lineNum = 0; lineNum < splitLines.length; ++lineNum) {
String currentLine = splitLines[lineNum];
if (currentLine.trim().isEmpty()) {
splitLines[lineNum] = "";
} else if (currentLine.matches(".*<[^/!][^<>]+?(?<!/)>?")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
} else if (currentLine.matches(".*</[^<>]+?>")) {
if (tabLevel < 0) {
tabLevel = 0;
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
} else if (currentLine.matches("[^<>]*?/>")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
if (tabLevel < 0) {
tabLevel = 0;
} else {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
return String.join("\n", splitLines);
It makes one assumption: that there are no <> characters except for those that comprise the XML/XHTML tags.
Create xml file :
new FileInputStream("xml Store - Copy.xml") ;// result xml file format incorrect !
so that, when parse the content of the given input source as an XML document
and return a new DOM object.
Document original = null;
original.parse("data.xml");//input source as an XML document

Prevent transformer.transform( source, result ) from escaping special character

I'm updating the nodes and text content of an XML document using the DOM parser. To save the updated DOM I'm using the transformer.transform method.
Below is the sample code:
String xmlText = "<uc>abcd><name>mine</name>efgh\netg<tag>sd</tag></uc>";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inStream = new InputSource();
inStream.setCharacterStream(new StringReader(xmlText));
Document document = documentBuilder.parse(inStream);
Node node = document.getDocumentElement();
NodeList childNodes = node.getChildNodes();
for(int i=0; i<childNodes.getLength(); i++) {
if(childNodes.item(i).getNodeType() == Node.TEXT_NODE) {
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource( document );
OutputStream xml = new ByteArrayOutputStream();
StreamResult result = new StreamResult( xml );
transformer.transform( source, result );
String formattedXml = xml.toString();
Since my updated document has text content like ">", the transformer.transform method changes it to &gt;.
Is there a way to get the output without escaping special characters?
I can't use another parser because of some project constraints.
I can't use StringEscapeUtils.unescapeXml() either. The reason is that the XML can already contain &gt;. If I use this utility method, any &gt; that was originally present in the XML will also get changed.
So I want a mechanism that will not escape any special character.
The transformer you create with
Transformer transformer = tFactory.newTransformer();
is initialized with a default stylesheet that implements the identity transformation. That means it will simply serialize your DOM to a well-formed XML document. Output escaping is automatically applied where necessary.
If you want better control over the output, and possibly want to generate something that does not adhere to XML document structure, you can use a custom stylesheet that switches the output method to text. This way you control more of the structure, but it is also easier to make mistakes and produce output that is not well-formed XML.
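A minimal sketch of the text-output approach, under some loud assumptions: the stylesheet below is made up for illustration, it writes the tags by hand, and it deliberately ignores attributes, comments and namespaces. The class name TextOutputSketch is hypothetical. With the output method set to text, text nodes are written without escaping:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TextOutputSketch {

    // Illustrative stylesheet: emits start and end tags as plain text and
    // relies on the built-in template to copy text nodes verbatim.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"*\">"
      + "<xsl:text>&lt;</xsl:text><xsl:value-of select=\"name()\"/><xsl:text>&gt;</xsl:text>"
      + "<xsl:apply-templates/>"
      + "<xsl:text>&lt;/</xsl:text><xsl:value-of select=\"name()\"/><xsl:text>&gt;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String serialize(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        // Build the transformer from the custom stylesheet instead of the default identity one.
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<uc>abcd><name>mine</name>efgh</uc>"));
    }
}
```

The price of this approach is that nothing guarantees well-formed output any more: a "<" or "&" in the text content would also be written as-is.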
More information at

Editing xml content in java and passing it as string, using node preferably

I have an XML document, which will be used as a template:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><entry xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"><content type="application/xml"><m:properties><d:AccountEnabled>true</d:AccountEnabled><d:DisplayName>SampleAppTestj5</d:DisplayName><d:MailNickname>saTestj5</d:MailNickname><d:Password>Qwerty1234</d:Password><d:UserPrincipalName>saTestj5#identropy.us</d:UserPrincipalName></m:properties></content></entry>
I'm loading it in Java using this code, where payLoadXML.xml has the above content:
InputStream is = getClass().getClassLoader().getResourceAsStream("/payLoadXML.xml");
Now I'm trying to edit the tag values, for example changing "saTestj5" to "saTestj6", and then convert the entire XML document and store it in a string. Can anyone tell me how I can achieve this? I was told this can be done using "Node"; is that possible?
Use JAXB or a SAX parser: convert the XML into an object, change the object through its accessor methods, and convert it back to XML.
Try this:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = null;
docBuilder = docFactory.newDocumentBuilder();
Document doc = null;
InputStream is = getClass().getClassLoader().getResourceAsStream("payLoadXML.xml"); // no leading slash: ClassLoader resource names are already relative to the classpath root
doc = docBuilder.parse(is);
Node staff = doc.getElementsByTagName("m:properties").item(0);
Text givenNameValue = doc.createTextNode("abc");
Element givenName = doc.createElement("d:GivenName");
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = null;
transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(source, result);
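For the change asked about in the question (replacing the value "saTestj5"), a minimal sketch could look like the following. The class name EditTemplateSketch, the method setMailNickname and the shortened template are assumptions for illustration. The element is looked up by namespace URI rather than by the "d:" prefix, which requires a namespace-aware parser:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class EditTemplateSketch {

    static final String D_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices";

    // Shortened version of the template from the question.
    static final String TEMPLATE =
        "<entry xmlns=\"http://www.w3.org/2005/Atom\" xmlns:d=\"" + D_NS + "\">"
      + "<d:DisplayName>SampleAppTestj5</d:DisplayName>"
      + "<d:MailNickname>saTestj5</d:MailNickname>"
      + "</entry>";

    // Parses the template, replaces the text of d:MailNickname and returns the XML as a String.
    static String setMailNickname(InputStream template, String newValue) throws Exception {
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setNamespaceAware(true);
        Document doc = docFactory.newDocumentBuilder().parse(template);

        Node nickname = doc.getElementsByTagNameNS(D_NS, "MailNickname").item(0);
        if (nickname == null) {
            throw new IllegalStateException("Template has no MailNickname element");
        }
        nickname.setTextContent(newValue);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        InputStream is = new ByteArrayInputStream(TEMPLATE.getBytes(StandardCharsets.UTF_8));
        System.out.println(setMailNickname(is, "saTestj6"));
    }
}
```

In the real application the InputStream would come from getResourceAsStream instead of the in-memory template used here.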

How to set UTF-16 encoding format for Xml?

I need to create XML as a string to pass to a server. I have managed to convert the data into XML, but the encoding is set to UTF-8 by default. I want to set it to UTF-16 instead, but I have no idea how to do that.
private void XmlCreation(int size,List<DataItem> item) throws ParserConfigurationException, TransformerException
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootElement = document.createElement("ArrayOfDataItem");
for (DataItem in: item)
Element subroot = document.createElement("DataItem");
Element em = document.createElement(in.getKey());
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
java.io.StringWriter sw = new java.io.StringWriter();
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(sw); // write into the StringWriter so that sw.toString() below actually holds the XML
transformer.transform(source, result);
String xml = sw.toString();
Thanks guys
I haven't tested, but that should do the trick:
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
This article might help you. Basically, you call setOutputProperty with OutputKeys.ENCODING as key and the desired encoding ("UTF-16") as value.
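A minimal sketch of that call in context, assuming a small stand-in document. The class name Utf16XmlSketch and its helpers are hypothetical. Note that when the result is a Writer (such as a StringWriter), the property mainly affects the encoding named in the XML declaration, since a Java String has no byte encoding of its own; to obtain actual UTF-16 bytes, serialize to an OutputStream:

```java
import java.io.ByteArrayOutputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class Utf16XmlSketch {

    // Hypothetical stand-in for the document built in XmlCreation.
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = document.createElement("ArrayOfDataItem");
        Element item = document.createElement("DataItem");
        item.setTextContent("value");
        root.appendChild(item);
        document.appendChild(root);
        return document;
    }

    static Transformer utf16Transformer() throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
        return transformer;
    }

    // XML as a String; the declaration names UTF-16.
    static String toXmlString(Document document) throws Exception {
        StringWriter sw = new StringWriter();
        utf16Transformer().transform(new DOMSource(document), new StreamResult(sw));
        return sw.toString();
    }

    // XML as bytes; here the transformer performs the UTF-16 encoding itself.
    static byte[] toUtf16Bytes(Document document) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        utf16Transformer().transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document document = sampleDocument();
        System.out.println(toXmlString(document));
        System.out.println(new String(toUtf16Bytes(document), StandardCharsets.UTF_16));
    }
}
```

If the server expects bytes, sending the result of toUtf16Bytes avoids any mismatch between the declaration and the bytes that are actually transmitted.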

