DocumentBuilderFactory removing namespaces, even with #setNamespaceAware - java

I'm trying to convert an XML string to a Document, do some DOM manipulation, then convert it back to a string. But I'm having trouble getting the namespaces to be retained after the transformation. I'd like to keep the namespace declaration on child elements that have the same namespace as the parent.
I'm running the following simple code to test:
final String test = "<element xmlns:xc=\"urn:myNamespace\">\n"
+ "\t<child xmlns:xc=\"urn:myNamespace\">\n"
+ "\t\t<attribute>value</attribute>\n"
+ "\t</child>\n"
+ "</element>\n";
System.out.println(test);
final DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
final Document testDoc = documentBuilderFactory
.newDocumentBuilder()
.parse(new ByteArrayInputStream(test.getBytes(StandardCharsets.UTF_8)));
final StringWriter stringWriter = new StringWriter();
final TransformerFactory transformerFactory = TransformerFactory.newInstance();
final Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, StandardCharsets.UTF_8.name());
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
transformer.transform(new DOMSource(testDoc), new StreamResult(stringWriter));
System.out.println(stringWriter.toString());
The output is:
<element xmlns:xc="urn:myNamespace">
<child xmlns:xc="urn:myNamespace">
<attribute>value</attribute>
</child>
</element>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<element xmlns:xc="urn:myNamespace">
<child>
<attribute>value</attribute>
</child>
</element>
I'm using Java 7. Is there any way to keep the namespace in the child element?
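One way to check whether anything is actually lost: a repeated xmlns:xc declaration on the child is redundant, because namespace bindings are inherited from ancestors. The sketch below round-trips a whitespace-free copy of the test string and asks the re-parsed child what the xc prefix resolves to. The class name NamespaceRoundTrip and its helper methods are made up for illustration, and the code assumes the JDK's default parser and transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class NamespaceRoundTrip {
    // Same structure as the question's test string, without whitespace,
    // so that the first child of the root is the <child> element.
    static final String TEST = "<element xmlns:xc=\"urn:myNamespace\">"
            + "<child xmlns:xc=\"urn:myNamespace\">"
            + "<attribute>value</attribute>"
            + "</child>"
            + "</element>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    static String serialize(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parse, serialize, parse again, then resolve the prefix on <child>.
    static String childBindingAfterRoundTrip() throws Exception {
        Document reparsed = parse(serialize(parse(TEST)));
        Node child = reparsed.getDocumentElement().getFirstChild();
        return child.lookupNamespaceURI("xc");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childBindingAfterRoundTrip());
    }
}
```

If the prefix still resolves on the child after the round trip, the two documents are equivalent for namespace-aware consumers, even though the text differs.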

Related

How to create xml file using Java?

I am creating an XML file using a Java Transformer. The root node has syntax like this:
<AUTO-RESPONSE-DOCUMENT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://someurl">.
I am creating the root node like this:
Document doc = docBuilder.newDocument();
Element ele = doc.createElement("AUTO-RESPONSE-DOCUMENT");
doc.appendChild(ele);
How should I put the above URLs on the AUTO-RESPONSE-DOCUMENT node?
If you mean the namespace attributes: you can set them like all other attributes:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element ele = doc.createElement("AUTO-RESPONSE-DOCUMENT");
// Add namespace attributes
ele.setAttribute("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
ele.setAttribute("xmlns:xsd", "http://www.w3.org/2001/XMLSchema");
ele.setAttribute("xmlns", "http://someurl");
doc.appendChild(ele);
Put through this Document-to-text code:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);
it creates this output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<AUTO-RESPONSE-DOCUMENT xmlns="http://someurl"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
/**
* @param args
* @throws Exception if the document cannot be built or serialized
*/
public static void main(String[] args) throws Exception {
DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = docBuilder.newDocument();
Element ele = doc.createElement("AUTO-RESPONSE-DOCUMENT");
doc.appendChild(ele);
ele.setAttribute("xmlns", "http://someurl");
ele.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:xsd", "http://www.w3.org/2001/XMLSchema");
ele.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
TransformerFactory.newInstance().newTransformer().transform(new DOMSource(doc), new StreamResult(System.out));
}
Note that the namespace URI for the "xmlns" prefix must be exactly as shown.
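As a variation on the answers above, here is a minimal sketch that also places the root element itself in the default namespace by creating it with createElementNS, rather than only attaching an xmlns attribute. The class name NamespacedRoot is hypothetical, and the sketch assumes the JDK's built-in DOM and transformer, which declare the element's namespace on output.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only.
class NamespacedRoot {
    static Document build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // The element is created in the default namespace, so the serializer
        // is expected to declare that namespace when writing the document.
        Element root = doc.createElementNS("http://someurl", "AUTO-RESPONSE-DOCUMENT");

        // Prefix declarations are attributes in the reserved xmlns namespace,
        // "http://www.w3.org/2000/xmlns/".
        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:xsi",
                XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI);
        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:xsd",
                XMLConstants.W3C_XML_SCHEMA_NS_URI);
        doc.appendChild(root);
        return doc;
    }

    static String serialize(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(build()));
    }
}
```

The difference matters for namespace-aware consumers: with plain createElement, getNamespaceURI() on the root returns null even though the written text carries an xmlns attribute.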

Preserve newline in xml while using builder.parse method & Transformer

The objective is to read from an XML file and write to a new XML file while preserving newlines. We need the Document object to perform other XML tasks.
Say source.xml looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Code><![CDATA[code line1
code line 2
code line 3
code line 4]]></Code>
Now the destination should look the same with the newlines in the code element. But instead it ignores the newlines and makes it one line.
For writing, I am using the method below:
public static void writeFile(Document xml, File writeTo)
{
try
{
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
DOMSource source = new DOMSource(xml);
StreamResult result = new StreamResult(writeTo);
transformer.transform(source, result);
}
catch(TransformerException e)
{
System.out.println("Couldn't write file " + writeTo);
e.printStackTrace();
}
}
The Document xml is obtained using the parse(File) method of DocumentBuilder, roughly along these lines:
File file; // a list of files is recursively obtained from a given folder.
DocumentBuilderFactory documentBuilderfactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = documentBuilderfactory.newDocumentBuilder();
Document xml = builder.parse(file);
The builder.parse seems to be losing the newlines in the CDATA of Code element.
How do we preserve the newlines?
I am new to Java APIs.
When I put your snippets together I get this program:
public class TestNewLine {
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerException {
DocumentBuilderFactory documentBuilderfactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = documentBuilderfactory.newDocumentBuilder();
Document xml = builder.parse(TestNewLine.class.getResourceAsStream("data.xml"));
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
DOMSource source = new DOMSource(xml);
StreamResult result = new StreamResult(System.out);
transformer.transform(source, result);
}
}
and it prints out:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Code><![CDATA[code line1
code line 2
code line 3
code line 4]]></Code>
As far as I can tell, the newlines are already preserved. What output did you expect?

Dynamically generating XML Schema

I am trying to dynamically generate an XML schema using Xerces-J and I am getting the following error. I would appreciate any help with it.
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
dbfac.setNamespaceAware(true);
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
Element schema = doc.createElement("xs:schema");
schema.setAttribute("xmlns:xs", "http://www.w3.org/2001/XMLSchema");
doc.appendChild(schema);
Element e = doc.createElement("xs:element");
e.setAttribute("name", "test");
e.setAttribute("type", "xs:string");
schema.appendChild(e);
TransformerFactory transfac = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
//create string from xml tree
StringWriter sw = new StringWriter();
StreamResult result = new StreamResult(sw);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);
String xmlString = sw.toString();
System.out.println(xmlString);
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema1 = schemaFactory.newSchema(source);
Output is
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="test" type="xs:string"/>
</xs:schema>
org.xml.sax.SAXParseException: s4s-elt-schema-ns: The namespace of element 'xs:schema' must be from
the schema namespace, 'http://www.w3.org/2001/XMLSchema'.
When building a DOM, you don't specify namespaces as attributes. Instead, use createElementNS(), which takes two parameters: the first is the namespace URI, the second is the element's qualified name.
Note also that the prefix of a qualified name will automatically be matched to the namespace URI. If you want, you could eliminate the prefix altogether, and the serializer will do the right thing (either creating an xmlns attribute without prefix, or generating a prefix).
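A minimal sketch of that advice, applied to the schema from the question: the elements are created with createElementNS so that they really belong to the schema namespace. The class name SchemaFromDom is hypothetical. One assumption goes beyond the answer: the sketch also declares the xs prefix explicitly, because the attribute value "xs:string" is a qualified name that the schema compiler has to resolve from the DOM itself.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only.
class SchemaFromDom {
    static final String XS = XMLConstants.W3C_XML_SCHEMA_NS_URI;

    static Document buildSchemaDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Namespace URI first, qualified name second.
        Element schema = doc.createElementNS(XS, "xs:schema");
        // Assumption: declare the prefix explicitly so that the QName value
        // "xs:string" below can be resolved when the DOM is compiled directly.
        schema.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:xs", XS);
        doc.appendChild(schema);

        Element element = doc.createElementNS(XS, "xs:element");
        element.setAttributeNS(null, "name", "test");
        element.setAttributeNS(null, "type", "xs:string");
        schema.appendChild(element);
        return doc;
    }

    // Compiles the in-memory DOM into a Schema object.
    static Schema compile() throws Exception {
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XS);
        return schemaFactory.newSchema(new DOMSource(buildSchemaDocument()));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compile() != null ? "schema compiled" : "no schema");
    }
}
```

With createElement("xs:schema"), the element's getNamespaceURI() is null no matter which attributes are set afterwards, which is what the s4s-elt-schema-ns error is complaining about.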
I had a similar problem and found the Apache Commons XMLSchema library.

Add xml-stylesheet and get standalone = yes

I added the solution to the code below.
The code at the bottom is what I have. I removed the creation of all tags.
At the top of the XML file I get <?xml version="1.0" encoding="UTF-8" standalone="no"?>. Note that standalone is "no", even though I have set it to "yes".
The first question: How do I get standalone = yes?
I would like to add <?xml-stylesheet type="text/xsl" href="my.stylesheet.xsl"?> at line two in the xml file.
Second question: How do I do that?
Some useful links? Anything?
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
doc.setXmlStandalone(true);
ProcessingInstruction pi = doc.createProcessingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
Element root = doc.createElement("root-element");
doc.appendChild(root);
doc.insertBefore(pi, root);
<cut>
TransformerFactory transfac = TransformerFactory.newInstance();
transfac.setAttribute("indent-number", Integer.valueOf(2));
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
trans.setOutputProperty(OutputKeys.STANDALONE, "yes");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
trans.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "name");
FileOutputStream fout = new FileOutputStream(filepath);
BufferedOutputStream bout= new BufferedOutputStream(fout);
trans.transform(new DOMSource(doc), new StreamResult(new OutputStreamWriter(bout, "utf-8")));
I added
doc.setXmlStandalone(true);
ProcessingInstruction pi = doc.createProcessingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
before the cut and
doc.insertBefore(pi, root);
right after the root element was appended to the doc.
In my code, I wrote:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
document.setXmlStandalone(true);
TransformerFactory tfactory = TransformerFactory.newInstance();
Transformer transformer = tfactory.newTransformer();
transformer.setOutputProperty(OutputKeys.STANDALONE, "yes");
output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
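Putting both pieces together, here is a minimal, self-contained sketch that writes to a string instead of a file. The class name StandaloneWithStylesheet and the method render() are made up for illustration, and the behaviour shown assumes the JDK's built-in transformer; other implementations may treat the standalone flag differently.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ProcessingInstruction;

// Hypothetical helper class for illustration only.
class StandaloneWithStylesheet {
    static String render() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        // Mark the document itself as standalone, as in the answers above.
        doc.setXmlStandalone(true);

        Element root = doc.createElement("root-element");
        doc.appendChild(root);

        // The stylesheet reference is a processing instruction that must
        // come before the root element in document order.
        ProcessingInstruction pi = doc.createProcessingInstruction(
                "xml-stylesheet", "type=\"text/xsl\" href=\"my.stylesheet.xsl\"");
        doc.insertBefore(pi, root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        transformer.setOutputProperty(OutputKeys.STANDALONE, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

Setting the flag on the Document as well as on the Transformer mirrors what both posters did; the question reports that the output property alone still produced standalone="no".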

Java XML Output - proper indenting for child items

I'd like to serialize a simple data model to XML. I've been using the standard org.w3c.dom-related code (see below). The indentation is better than without OutputKeys.INDENT, yet one thing remains: proper indentation for child elements.
I know that this has been asked before on Stack Overflow, yet that configuration did not work for me. This is the code I'm using:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();
doc = addItemsToDocument(doc);
// The addItemsToDocument method adds childElements to the document.
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", Integer.valueOf(4));
// switching to setAttribute("indent-number", 4); doesn't help
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(outFile);
// outFile is a regular File outFile = new File("some/path/foo.xml");
transformer.transform(source, result);
The output produced is :
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<stuffcontainer>
<stuff description="something" duration="240" title="abc">
<otherstuff />
</stuff>
</stuffcontainer>
Whereas I would want it (for more clarity) like :
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<stuffcontainer>
    <stuff description="something" duration="240" title="abc">
        <otherstuff />
    </stuff>
</stuffcontainer>
I was just wondering if there is a way of doing this, make it properly indented for the child elements.
Thank you in advance !
Happy Easter coding :-) !
If the Transformer implementation you're using is Xalan-J, then you should be able to use:
transformer.setOutputProperty(
"{http://xml.apache.org/xslt}indent-amount", "5");
See also: http://xml.apache.org/xalan-j/usagepatterns.html
import com.sun.org.apache.xml.internal.serializer.OutputPropertiesFactory;
transformer.setOutputProperty(OutputPropertiesFactory.S_KEY_INDENT_AMOUNT, "4");
Document doc;
.....
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.transform(new DOMSource(doc), new StreamResult(new File("filename.xml")));
transformer.transform(new DOMSource(doc), new StreamResult(System.out));
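For a runnable version of the snippets above, here is a minimal sketch that builds the document from the question in memory and serializes it to a string. The class name IndentAmountDemo is hypothetical, and the indent-amount property is specific to Xalan-derived transformers such as the JDK's built-in one; other implementations may ignore or reject it.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only.
class IndentAmountDemo {
    static String render() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element container = doc.createElement("stuffcontainer");
        Element stuff = doc.createElement("stuff");
        stuff.setAttribute("description", "something");
        stuff.setAttribute("duration", "240");
        stuff.setAttribute("title", "abc");
        stuff.appendChild(doc.createElement("otherstuff"));
        container.appendChild(stuff);
        doc.appendChild(container);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific key: number of spaces per nesting level.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

Because the property is set on the Transformer rather than the TransformerFactory, it does not depend on the factory's "indent-number" attribute from the question.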
