Generate/get xpath from XML node java


I'm interested in advice, pseudocode, or an explanation rather than an actual implementation.
I'd like to go through an XML document and visit all of its nodes:
Check each node for the existence of attributes.
If the node has no attributes, get/generate a String holding its xpath.
If the node does have attributes, iterate through the attribute list and create an xpath for each attribute, including the node itself.
Edit
My reason for doing this: I'm writing automated tests in JMeter, and for every request I need to verify that it actually did its job, so I'm asserting results by getting node values with XPath.
When the request is small it's not a problem to create the asserts by hand, but for larger ones it's really a pain.
I'm looking for a Java approach.
Goal
My goal is to achieve the following from this example XML file:
<root>
<elemA>one</elemA>
<elemA attribute1='first' attribute2='second'>two</elemA>
<elemB>three</elemB>
<elemA>four</elemA>
<elemC>
<elemB>five</elemB>
</elemC>
</root>
to produce the following :
//root[1]/elemA[1]='one'
//root[1]/elemA[2]='two'
//root[1]/elemA[2][@attribute1='first']
//root[1]/elemA[2][@attribute2='second']
//root[1]/elemB[1]='three'
//root[1]/elemA[3]='four'
//root[1]/elemC[1]/elemB[1]='five'
Explained :
If the node value/text is not null/empty, get the xpath and add ='nodevalue' for assertion purposes.
If the node has attributes, create asserts for them too.
Update
I found this example. It doesn't produce the correct results, but I'm looking for something like this:
http://www.coderanch.com/how-to/java/SAXCreateXPath

Update:
@c0mrade has updated his question. Here is a solution to it:
This XSLT transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vApos">'</xsl:variable>
<xsl:template match="*[@* or not(*)] ">
<xsl:if test="not(*)">
<xsl:apply-templates select="ancestor-or-self::*" mode="path"/>
<xsl:value-of select="concat('=',$vApos,.,$vApos)"/>
<xsl:text>
</xsl:text>
</xsl:if>
<xsl:apply-templates select="@*|*"/>
</xsl:template>
<xsl:template match="*" mode="path">
<xsl:value-of select="concat('/',name())"/>
<xsl:variable name="vnumPrecSiblings" select=
"count(preceding-sibling::*[name()=name(current())])"/>
<xsl:if test="$vnumPrecSiblings">
<xsl:value-of select="concat('[', $vnumPrecSiblings +1, ']')"/>
</xsl:if>
</xsl:template>
<xsl:template match="@*">
<xsl:apply-templates select="../ancestor-or-self::*" mode="path"/>
<xsl:value-of select="concat('[@',name(), '=',$vApos,.,$vApos,']')"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<root>
<elemA>one</elemA>
<elemA attribute1='first' attribute2='second'>two</elemA>
<elemB>three</elemB>
<elemA>four</elemA>
<elemC>
<elemB>five</elemB>
</elemC>
</root>
produces exactly the wanted, correct result:
/root/elemA='one'
/root/elemA[2]='two'
/root/elemA[2][@attribute1='first']
/root/elemA[2][@attribute2='second']
/root/elemB='three'
/root/elemA[3]='four'
/root/elemC/elemB='five'
When applied to the newly-provided document by @c0mrade:
<root>
<elemX serial="kefw90234kf2esda9231">
<id>89734</id>
</elemX>
</root>
again the correct result is produced:
/root/elemX[@serial='kefw90234kf2esda9231']
/root/elemX/id='89734'
Explanation:
Only elements that have no child elements, or that have attributes, are matched and processed.
For any such element, if it has no child elements, all of its ancestor-or-self elements are processed in a specific mode named 'path'. Then the "='theValue'" part is output, followed by an NL character.
All attributes of the matched element are then processed.
Finally, templates are applied to all child elements.
Processing an element in the 'path' mode is simple: a / character and the name of the element are output. Then, if there are preceding siblings with the same name, a "[numPrecSiblings+1]" part is output.
Processing of attributes is simple: first all ancestor-or-self:: elements of the attribute's parent are processed in 'path' mode, then the [@attrName='attrValue'] part is output, followed by an NL character.
Do note:
Names that are in a namespace are displayed without any problem and in their initial readable form.
To aid readability, an index of [1] is never displayed.
Below is my initial answer (may be ignored)
Here is a pure XSLT 1.0 solution:
Below is a sample xml document and a stylesheet that takes a node-set parameter and produces one valid XPath expression for every member-node.
stylesheet (buildPath.xsl):
<xsl:stylesheet version='1.0'
xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
>
<xsl:output method="text"/>
<xsl:variable name="theParmNodes" select="//namespace::*[local-name() =
'myNamespace']"/>
<xsl:template match="/">
<xsl:variable name="theResult">
<xsl:for-each select="$theParmNodes">
<xsl:variable name="theNode" select="."/>
<xsl:for-each select="$theNode |
$theNode/ancestor-or-self::node()[..]">
<xsl:element name="slash">/</xsl:element>
<xsl:choose>
<xsl:when test="self::*">
<xsl:element name="nodeName">
<xsl:value-of select="name()"/>
<xsl:variable name="thisPosition"
select="count(preceding-sibling::*[name(current()) =
name()])"/>
<xsl:variable name="numFollowing"
select="count(following-sibling::*[name(current()) =
name()])"/>
<xsl:if test="$thisPosition + $numFollowing > 0">
<xsl:value-of select="concat('[', $thisPosition +
1, ']')"/>
</xsl:if>
</xsl:element>
</xsl:when>
<xsl:otherwise> <!-- This node is not an element -->
<xsl:choose>
<xsl:when test="count(. | ../@*) = count(../@*)">
<!-- Attribute -->
<xsl:element name="nodeName">
<xsl:value-of select="concat('@',name())"/>
</xsl:element>
</xsl:when>
<xsl:when test="self::text()"> <!-- Text -->
<xsl:element name="nodeName">
<xsl:value-of select="'text()'"/>
<xsl:variable name="thisPosition"
select="count(preceding-sibling::text())"/>
<xsl:variable name="numFollowing"
select="count(following-sibling::text())"/>
<xsl:if test="$thisPosition + $numFollowing > 0">
<xsl:value-of select="concat('[', $thisPosition +
1, ']')"/>
</xsl:if>
</xsl:element>
</xsl:when>
<xsl:when test="self::processing-instruction()">
<!-- Processing Instruction -->
<xsl:element name="nodeName">
<xsl:value-of select="'processing-instruction()'"/>
<xsl:variable name="thisPosition"
select="count(preceding-sibling::processing-instruction())"/>
<xsl:variable name="numFollowing"
select="count(following-sibling::processing-instruction())"/>
<xsl:if test="$thisPosition + $numFollowing > 0">
<xsl:value-of select="concat('[', $thisPosition +
1, ']')"/>
</xsl:if>
</xsl:element>
</xsl:when>
<xsl:when test="self::comment()"> <!-- Comment -->
<xsl:element name="nodeName">
<xsl:value-of select="'comment()'"/>
<xsl:variable name="thisPosition"
select="count(preceding-sibling::comment())"/>
<xsl:variable name="numFollowing"
select="count(following-sibling::comment())"/>
<xsl:if test="$thisPosition + $numFollowing > 0">
<xsl:value-of select="concat('[', $thisPosition +
1, ']')"/>
</xsl:if>
</xsl:element>
</xsl:when>
<!-- Namespace: -->
<xsl:when test="count(. | ../namespace::*) =
count(../namespace::*)">
<xsl:variable name="apos">'</xsl:variable>
<xsl:element name="nodeName">
<xsl:value-of select="concat('namespace::*',
'[local-name() = ', $apos, local-name(), $apos, ']')"/>
</xsl:element>
</xsl:when>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:variable>
<xsl:value-of select="msxsl:node-set($theResult)"/>
</xsl:template>
</xsl:stylesheet>
xml source (buildPath.xml):
<!-- top level Comment -->
<root>
<nodeA>textA</nodeA>
<nodeA id="nodeA-2">
<?myProc ?>
xxxxxxxx
<nodeB/>
<nodeB xmlns:myNamespace="myTestNamespace">
<!-- Comment within /root/nodeA[2]/nodeB[2] -->
<nodeC/>
<!-- 2nd Comment within /root/nodeA[2]/nodeB[2] -->
</nodeB>
yyyyyyy
<nodeB/>
<?myProc2 ?>
</nodeA>
</root>
<!-- top level Comment -->
Result:
/root/nodeA[2]/nodeB[2]/namespace::*[local-name() = 'myNamespace']
/root/nodeA[2]/nodeB[2]/nodeC/namespace::*[local-name() =
'myNamespace']

Here is how this can be done with SAX:
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
public class FragmentContentHandler extends DefaultHandler {
private String xPath = "/";
private XMLReader xmlReader;
private FragmentContentHandler parent;
private StringBuilder characters = new StringBuilder();
private Map<String, Integer> elementNameCount = new HashMap<String, Integer>();
public FragmentContentHandler(XMLReader xmlReader) {
this.xmlReader = xmlReader;
}
private FragmentContentHandler(String xPath, XMLReader xmlReader, FragmentContentHandler parent) {
this(xmlReader);
this.xPath = xPath;
this.parent = parent;
}
@Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
Integer count = elementNameCount.get(qName);
if(null == count) {
count = 1;
} else {
count++;
}
elementNameCount.put(qName, count);
String childXPath = xPath + "/" + qName + "[" + count + "]";
int attsLength = atts.getLength();
for(int x=0; x<attsLength; x++) {
System.out.println(childXPath + "[@" + atts.getQName(x) + "='" + atts.getValue(x) + "']");
}
FragmentContentHandler child = new FragmentContentHandler(childXPath, xmlReader, this);
xmlReader.setContentHandler(child);
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
String value = characters.toString().trim();
if(value.length() > 0) {
System.out.println(xPath + "='" + value + "'");
}
xmlReader.setContentHandler(parent);
}
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
characters.append(ch, start, length);
}
}
It can be tested with:
import java.io.FileInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
public class Demo {
public static void main(String[] args) throws Exception {
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
xr.setContentHandler(new FragmentContentHandler(xr));
xr.parse(new InputSource(new FileInputStream("input.xml")));
}
}
This will produce the desired output:
//root[1]/elemA[1]='one'
//root[1]/elemA[2][@attribute1='first']
//root[1]/elemA[2][@attribute2='second']
//root[1]/elemA[2]='two'
//root[1]/elemB[1]='three'
//root[1]/elemA[3]='four'
//root[1]/elemC[1]/elemB[1]='five'

With jOOX (a port of the jQuery API to Java; disclaimer: I work for the company behind the library), you can almost achieve what you want in a single statement:
// I'm assuming this:
import static org.joox.JOOX.$;
// And then...
List<String> coolList = $(document).xpath("//*[not(*)]").map(
context -> $(context).xpath() + "='" + $(context).text() + "'"
);
If document is your sample document:
<root>
<elemA>one</elemA>
<elemA attribute1='first' attribute2='second'>two</elemA>
<elemB>three</elemB>
<elemA>four</elemA>
<elemC>
<elemB>five</elemB>
</elemC>
</root>
This will produce
/root[1]/elemA[1]='one'
/root[1]/elemA[2]='two'
/root[1]/elemB[1]='three'
/root[1]/elemA[3]='four'
/root[1]/elemC[1]/elemB[1]='five'
By "almost", I mean that jOOX does not (yet) support matching/mapping attributes. Hence, your attributes will not produce any output. This will be implemented in the near future, though.

private static void buildEntryList( List<String> entries, String parentXPath, Element parent ) {
NamedNodeMap attrs = parent.getAttributes();
for( int i = 0; i < attrs.getLength(); i++ ) {
Attr attr = (Attr)attrs.item( i );
//TODO: escape attr value
entries.add( parentXPath+"[@"+attr.getName()+"='"+attr.getValue()+"']");
}
HashMap<String, Integer> nameMap = new HashMap<String, Integer>();
NodeList children = parent.getChildNodes();
for( int i = 0; i < children.getLength(); i++ ) {
Node child = children.item( i );
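if( child instanceof Text ) {
String text = ((Text)child).getData().trim();
// Skip whitespace-only text nodes (e.g. indentation between elements)
if( text.isEmpty() ) continue;
//TODO: escape child value
entries.add( parentXPath+"='"+text+"'" );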
} else if( child instanceof Element ) {
String childName = child.getNodeName();
Integer nameCount = nameMap.get( childName );
nameCount = nameCount == null ? 1 : nameCount + 1;
nameMap.put( child.getNodeName(), nameCount );
buildEntryList( entries, parentXPath+"/"+childName+"["+nameCount+"]", (Element)child);
}
}
}
public static List<String> getEntryList( Document doc ) {
ArrayList<String> entries = new ArrayList<String>();
Element root = doc.getDocumentElement();
buildEntryList(entries, "/"+root.getNodeName()+"[1]", root );
return entries;
}
This code works with two assumptions: you aren't using namespaces and there are no mixed content elements. The namespace limitation isn't a serious one, but it'd make your XPath expression much harder to read, as every element would be something like *:<name>[namespace-uri()='<nsuri>'][<index>], but otherwise it's easy to implement. Mixed content on the other hand would make the use of xpath very tedious, as you'd have to be able to individually address the second, third and so on text node within an element.

1. use w3c.dom
2. go recursively down
3. for each node there is an easy way to get its xpath: either by storing it as an array/list while doing #2, or via a function which goes recursively up until the parent is null, then reverses the array/list of encountered nodes.
something like that.
UPD:
and concatenate the final list in order to get the final xpath.
I don't think attributes will be a problem.
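A minimal sketch of the "go up until there is no parent element, then reverse" idea, assuming a plain org.w3c.dom tree without namespaces. The class and method names (XPathOfNode, xpathOf, positionAmongSameName) are made up for illustration; pushing each step to the front of a deque does the reversing along the way.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathOfNode {

    /** Builds an indexed absolute XPath such as /root[1]/el[2] for the given element. */
    static String xpathOf(Element element) {
        Deque<String> steps = new ArrayDeque<>();
        // Walk up while we are still on an element; the root element's parent is the Document.
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            // Adding at the front reverses the bottom-up order as we go.
            steps.addFirst(n.getNodeName() + "[" + positionAmongSameName(n) + "]");
        }
        return "/" + String.join("/", steps);
    }

    /** 1-based position of the node among its sibling elements with the same name. */
    static int positionAmongSameName(Node node) {
        int position = 1;
        for (Node s = node.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s instanceof Element && s.getNodeName().equals(node.getNodeName())) {
                position++;
            }
        }
        return position;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><el/><something/><el/></root>");
        Element secondEl = (Element) doc.getElementsByTagName("el").item(1);
        System.out.println(xpathOf(secondEl)); // /root[1]/el[2]
    }
}
```

Counting preceding siblings on every step is quadratic in the worst case; for large documents, tracking the counts during the downward pass would likely be cheaper.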

I've done a similar task once. The main idea used was that you can use indexes of the element in the xpath. For example in the following xml
<root>
<el />
<something />
<el />
</root>
xpath to the second <el/> will be /root[1]/el[2] (xpath indexes are 1-based). This reads as "take the first root, then take the second one from all elements with the name el". So element something does not affect indexing of elements el. So you can in theory create an xpath for each specific element in your xml. In practice I've accomplished this by walking the tree recursively and remembering information about elements and their indexes along the way.
Creating an xpath referencing a specific attribute of the element was then just a matter of adding '/@attrName' to the element's xpath.
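One way to check that such indexed paths really select what you expect is to evaluate them with the JDK's javax.xml.xpath API. This is only a sketch: the id attributes are added to the example purely so the two el elements can be told apart, and the class name IndexedXPathCheck is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class IndexedXPathCheck {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Evaluates an XPath expression against the document and returns its string value. */
    static String evaluate(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // The id attributes are an assumption made for this demo only.
        Document doc = parse("<root><el id='a'/><something/><el id='b'/></root>");
        String elementPath = "/root[1]/el[2]";
        // Appending /@attrName to the element path addresses the attribute.
        String attributePath = elementPath + "/@id";
        System.out.println(evaluate(doc, attributePath)); // b
    }
}
```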

I have written a method to return the absolute path of an element in the Practical XML library. To give you an idea of how it works, here's an extract from one of the unit tests:
assertEquals("/root/wargle[2]/zargle",
DomUtil.getAbsolutePath(child3a));
So, you could recurse through the document, apply your tests, and use this to return the XPath. Or, what is probably better, is that you could use the XPath-based assertions from that same library.

I did the exact same thing last week while converting my XML to a Solr-compliant format.
Since you wanted pseudocode, this is how I accomplished that.
// You can skip the reference to parent and child.
1_ Initialize a custom node object: NodeObjectVO {String nodeName, String path, List attr, NodeObjectVO parent, List child}
2_ Create an empty list
3_ Create a DOM representation of the XML and iterate through the nodes. For each node, get the corresponding information. All the information, like node name, attribute names, and values, should be readily available from the DOM object. (You need to check the DOM NodeType; the code should ignore processing instructions and plain text nodes.)
// Code Bloat warning.
4_ The only tricky part is getting the path. I created an iterative utility method to get the xpath string from a NodeElement: while (node.parent != null) { path = "/" + node.nodeName + path; node = node.parent; }
(You can also achieve this by maintaining a global path variable that keeps track of the parent path for each iteration.)
5_ In the setter method setAttributes(List), I append all the available attributes to the object's path. (One path with all available attributes, not a list of paths with each possible combination of attributes. You might want to do it some other way.)
6_ Add the NodeObjectVO to the list.
7_ Now we have a flat (not hierarchical) list of custom node objects that have all the information I need.
(Note: Like I mentioned, I maintain the parent/child relationship; you should probably skip that part. There is a possibility of code bloat, especially in getparentpath. For small XML this was not a problem, but it is a concern for large XML.)
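The steps above could be sketched roughly as follows. This is not the original code: NodeInfo is a simplified stand-in for NodeObjectVO (without the parent and child links), the path is passed down during recursion instead of being rebuilt by walking up, and, as in the pseudocode, paths carry no positional indexes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FlatNodeList {

    /** Simplified, hypothetical stand-in for NodeObjectVO. */
    static final class NodeInfo {
        final String nodeName;
        final String path;
        final Map<String, String> attributes;

        NodeInfo(String nodeName, String path, Map<String, String> attributes) {
            this.nodeName = nodeName;
            this.path = path;
            this.attributes = attributes;
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns one NodeInfo per element, in document order. */
    static List<NodeInfo> flatten(Document doc) {
        List<NodeInfo> result = new ArrayList<>();
        Element root = doc.getDocumentElement();
        collect(root, "/" + root.getNodeName(), result);
        return result;
    }

    private static void collect(Element element, String path, List<NodeInfo> out) {
        // TreeMap gives a stable attribute order; NamedNodeMap does not promise one.
        Map<String, String> attributes = new TreeMap<>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Attr attr = (Attr) map.item(i);
            attributes.put(attr.getName(), attr.getValue());
        }
        out.add(new NodeInfo(element.getNodeName(), path, attributes));
        // Only element children are visited; text, comments and PIs are ignored.
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) c, path + "/" + c.getNodeName(), out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        for (NodeInfo n : flatten(parse("<root><a x='1'><b/></a><c/></root>"))) {
            System.out.println(n.path + " " + n.attributes);
        }
    }
}
```

Passing the parent path down avoids the repeated upward walks that the note above warns about for large documents.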

Related

Iterate ArrayList returned by Java extension function in XSLT(Saxon)

I have a program to perform XML mapping using XSLT. I'm using Saxon-HE-9.7 library for this. I'm also using reflexive extension functions in XSLT.
The XSLT calls a java function that returns ArrayList<HashMap<String, String>>
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0" xmlns:SQLService="com.db.SQLService" xmlns:ArrayList="java:java.util.ArrayList" xmlns:HashMap="java.util.HashMap" >
<xsl:output method="xml" indent="yes" />
<xsl:variable name="city">Texas</xsl:variable>
<xsl:variable name="query" select="'Select name, emp_id from employee where city = ?'" />
<xsl:variable name="list" select="SQLService:executeQueryMultiResult($query, $city)" />
<xsl:template match="/">
<test>
<xsl:for-each select="abc/company[@type='product']">
<employee>
<xsl:attribute name="details">
<xsl:value-of select="$list" />
</xsl:attribute>
</employee>
</xsl:for-each>
</test>
</xsl:template>
</xsl:stylesheet>
I'm getting only a single record in the list, which is the last record of the list returned by executeQueryMultiResult.
How can I store and iterate over all the elements of the list?

Firstly, I'm a bit surprised that when you iterate over abc/company[@type='product'], the body of the xsl:for-each doesn't depend in any way on the current selected company. This means that each iteration of this loop will produce exactly the same output.
Under the default Java-to-XPath conversions, the ArrayList should be converted to an XPath sequence, but the java Maps will not be converted to XPath maps; they need to be accessed as external objects.
See what count($list) returns and check that it matches your expectations.
LATER
I am unable to reproduce the problem. I tested it like this:
public void testListOfMaps() {
try {
Processor p = new Processor(true);
XsltCompiler c = p.newXsltCompiler();
String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<xsl:stylesheet version=\"3.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"\n" +
" xmlns:jf=\"java:s9apitest.TestReflexionHelper\">\n" +
" <xsl:output method=\"text\" />\n" +
" <xsl:template name='main'>\n" +
" <xsl:variable name=\"theList\" select=\"jf:getListOfMaps('Weds', 'Wednesday')\" />\n" +
" <xsl:value-of select=\"count($theList)\" />\n" +
" <xsl:value-of select=\"Q{java:java.util.Map}get($theList[1], 'Weds')\" />\n" +
" </xsl:template>\n" +
"</xsl:stylesheet>";
XsltTransformer t = c.compile(new StreamSource(new StringReader(s))).load();
StringWriter out = new StringWriter();
Serializer ser = p.newSerializer(out);
t.setDestination(ser);
t.setInitialTemplate(new QName("main"));
t.transform();
assertTrue(out.toString().equals("2Wednesday"));
} catch (SaxonApiException e1) {
fail(e1.getMessage());
}
}
where the extension function jf:getListOfMaps() is:
public static List<Map<String, String>> getListOfMaps(String x, String y) {
Map<String, String> m = new HashMap<>();
m.put("Mon", "Monday");
m.put("Tues", "Tuesday");
m.put(x, y);
Map<String, String> n = new HashMap<>();
n.put("Jan", "January");
n.put("Feb", "February");
List<Map<String, String>> list = new ArrayList<>();
list.add(m);
list.add(n);
return list;
}
The test demonstrates that Saxon is behaving according to the spec: the Java List of Maps is converted to an XPath sequence of external objects, where the external object is a wrapper around the Java Map that allows use of the underlying Java methods.
I ran this on Saxon 9.9 (9.7 is no longer supported).
I suggest you try and produce a repro that simplifies the problem by replacing your extension function with a dummy stub with the same signature that anyone can run for testing.
I also suggest you tell us exactly what your environment is. I'm a bit puzzled that you say you are using Saxon-HE, because Saxon-HE doesn't support reflexive extension functions.

is there any way other than using Xpath for this?

Hello, I'm writing this program:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
public class DOMbooks {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = factory.newDocumentBuilder();
File file = new File("books-fixed.xml");
Document doc = docBuilder.parse(file);
NodeList list = doc.getElementsByTagName("*");
int bookCounter = 1;
for (int i = 1; i < list.getLength(); i++) {
Element element = (Element)list.item(i);
String nodeName = element.getNodeName();
if (nodeName.equals("book")) {
bookCounter++;
System.out.println("BOOK " + bookCounter);
String isbn = element.getAttribute("sequence");
System.out.println("\tsequence:\t" + isbn);
}
else if (nodeName.equals("author")) {
System.out.println("\tAuthor:\t" + element.getChildNodes().item(0).getNodeValue());
}
else if (nodeName.equals("title")) {
System.out.println("\tTitle:\t" + element.getChildNodes().item(0).getNodeValue());
}
else if (nodeName.equals("publishYear")) {
System.out.println("\tpublishYear:\t" + element.getChildNodes().item(0).getNodeValue());
}
else if (nodeName.equals("genre")) {
System.out.println("\tgenre:\t" + element.getChildNodes().item(0).getNodeValue());
}
}
}
}
I want to print all the data about the "Science Fiction" books. I know I should use XPath, but I'm stuck, with too many errors. Any suggestions?
Assuming I have this table, I only want to select the Science Fiction books with all their info:
<book sequence="5">
<title>Aftershock</title>
<auther>Robert B. Reich</auther>
<publishYear>2010</publishYear>
<genre>Economics</genre>
</book>
<book sequence="6">
<title>The Time Machine</title>
<auther>H.G. Wells</auther>
<publishYear>1895</publishYear>
<genre>Science Fiction</genre>

i want to print all the data about the "Science Fiction" books.. i know i should use Xpath but it's stuck,
I assume you mean that you want all the books for which genre == "Science Fiction", right? In that case, XPath is really much simpler than whatever you were trying in Java (you don't show the root node, so I'll start with '//', which selects at any depth):
//book[genre = 'Science Fiction']
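If you want to stay in Java, that expression can be evaluated with the JDK's javax.xml.xpath API. The following is only a sketch: the <books> root element and the inline sample are assumptions (the question does not show the root), and the class name ScienceFictionBooks is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ScienceFictionBooks {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns printable lines for every book whose genre is 'Science Fiction'. */
    static List<String> describe(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList books = (NodeList) xpath.evaluate(
                "//book[genre = 'Science Fiction']", doc, XPathConstants.NODESET);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            lines.add("BOOK sequence=" + book.getAttribute("sequence"));
            // Print every child element, whatever its name is.
            for (Node c = book.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.ELEMENT_NODE) {
                    lines.add("\t" + c.getNodeName() + ":\t" + c.getTextContent());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical <books> root wrapping the two sample books from the question.
        String xml = "<books>"
                + "<book sequence='5'><title>Aftershock</title><auther>Robert B. Reich</auther>"
                + "<publishYear>2010</publishYear><genre>Economics</genre></book>"
                + "<book sequence='6'><title>The Time Machine</title><auther>H.G. Wells</auther>"
                + "<publishYear>1895</publishYear><genre>Science Fiction</genre></book>"
                + "</books>";
        describe(parse(xml)).forEach(System.out::println);
    }
}
```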
XSLT approach to simplify things
Now, having another look at your code, it looks like you want to print each and every element, including the element's name. This is more trivially done in XSLT:
<!-- every XSLT 1.0 must start like this -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- you want text -->
<xsl:output method="text" />
<!-- match any science fiction book (your primary goal) -->
<xsl:template match="book[genre = 'Science Fiction']">
<xsl:text>BOOK </xsl:text>
<xsl:value-of select="position()" />
<!-- send the children and attribute to be processed by templates -->
<xsl:apply-templates select="@sequence | *" />
</xsl:template>
<!-- "catch" any elements or attributes under <book> -->
<xsl:template match="book/* | book/@*">
<!-- a newline and a tab per line-->
<xsl:text>&#10;&#9;</xsl:text>
<!-- and the name of the element or attribute -->
<xsl:value-of select="local-name()" />
<!-- another tab, plus contents of the element or attribute -->
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="." />
</xsl:template>
<!-- make sure that other values are ignored, but process children -->
<xsl:template match="node()">
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
You can use this code, which is significantly shorter (if you ignore the comments and whitespace) and (arguably, once you get the hang of it) more readable than your original code. To use it:
Store it as books.xsl
Then, simply use this (copied and changed from here):
import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
public class TestMain {
public static void main(String[] args) throws IOException, URISyntaxException, TransformerException {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource(new File("books.xsl"));
Transformer transformer = factory.newTransformer(xslt);
Source text = new StreamSource(new File("books-fixed.xml"));
transformer.transform(text, new StreamResult(new File("output.txt")));
}
}
XPath 2.0
If you can use Saxon in Java, the above becomes a one-liner with XPath 2.0 and you don't even need XSLT:
for $book in //book[genre = 'Science Fiction']
return (
'BOOK',
count(//book[genre = 'Science Fiction'][. << $book]) + 1,
for $tag in $book/(@sequence | *)
return ($tag/local-name(), ':', string($tag))
)

Xpath - Java - Extracting multiple namespaces from XML

I am working on a parser written in Java. I can receive XML feeds from various locations, with various contents. I need to extract all the namespaces from the feed, so I can call this or that according to the feed. I have some trouble doing this in Java, and I am not really sure where the issue is.
Let's consider this XML:
<?xml version="1.0"?>
<?xml-stylesheet type='text/xsl' href='new.xsl'?>
<test xmlns:mynsone="http://www.ns.com/test" xmlns:demons="http://www.demons.com/test">
<p xmlns:domain="http://www.toto.com/test">
this is a test.
</p>
</test>
In order to test my XPath expression (I am rather new to it), I wrote a little .xsl script applied to that XML:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output
method="html"
encoding="ISO-8859-1"
doctype-public="-//W3C//DTD XHTML//EN"
doctype-system="http://www.w3.org/TR/2001/REC-xhtml11-20010531"
indent="yes" />
<xsl:template match="/">
<xsl:for-each select="//namespace::*">
<xsl:value-of select="." />
<xsl:text> </xsl:text><br />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
And this correctly provides me the list of namespaces encountered iterating the nodes:
http://www.w3.org/XML/1998/namespace
http://www.demons.com/test
http://www.ns.com/test
http://www.w3.org/XML/1998/namespace
http://www.demons.com/test
http://www.ns.com/test
http://www.toto.com/test
Now back to Java: here is the code I use.
InputStream file = url.openStream();
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
org.w3c.dom.Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//namespace::*";
System.out.println(expression);
NodeList nodelist = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int k = 0; k < nodelist.getLength(); k++)
{
Node mynode = nodelist.item(k);
System.out.println(mynode.toString());
}
And here is the result I obtain:
xmlns:mynsone="http://www.ns.com/test"
org.apache.xml.dtm.ref.dom2dtm.DOM2DTMdefaultNamespaceDeclarationNode@7dbb8ca4
xmlns:domain="http://www.toto.com/test"
Therefore, the "demons" namespace is not returned. The problem is that if I put several namespaces on one node, only one is returned in Java, whereas with the XSL script all are displayed.
I hope I made myself clear; I spent the past days browsing the web for examples, and I don't know if I'm really close and just missing a little something, or if my expression is simply not proper.
Thanks in advance.
OK, so I eventually used XPath 2.0 to do it, using Saxon-HE 9.4:
public static boolean detectGeoRssNamespace(InputStream sourceFeed) {
try {
if (sourceFeed.markSupported()) {
sourceFeed.reset();
}
String objectModel = NamespaceConstant.OBJECT_MODEL_SAXON;
System.setProperty("javax.xml.xpath.XPathFactory:"+NamespaceConstant.OBJECT_MODEL_SAXON, "net.sf.saxon.xpath.XPathFactoryImpl");
XPathFactory xpathFactory = XPathFactory.newInstance(objectModel);
XPath xpath = xpathFactory.newXPath();
InputSource is = new InputSource(sourceFeed);
SAXSource ss = new SAXSource(is);
NodeInfo doc = ((XPathEvaluator)xpath).setSource(ss);
String xpathExpressionStr = "distinct-values(//*[name()!=local-name()]/ concat('prefix=', substring-before(name(), ':'), '&uri=', namespace-uri()))";
XPathExpression xpathExpression = xpath.compile(xpathExpressionStr);
List nodelist = (List)xpathExpression.evaluate(doc, XPathConstants.NODESET);
System.out.println("<output>");
Iterator iter = nodelist.iterator();
while ( iter.hasNext() ) {
Object line = (Object)iter.next();
System.out.println(line.toString());
}
System.out.println("</output>");
} catch (XPathFactoryConfigurationException e) {
e.printStackTrace();
} catch (XPathException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}

What if you extract these namespaces into separate XML elements?
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output
method="html"
encoding="ISO-8859-1"
doctype-public="-//W3C//DTD XHTML//EN"
doctype-system="http://www.w3.org/TR/2001/REC-xhtml11-20010531"
indent="yes" />
<xsl:template match="/">
<xsl:for-each select="//namespace::*">
<namespace>
<xsl:value-of select="." />
</namespace>
<xsl:text> </xsl:text><br />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Question was solved using xPath 2.0 (code included in question)
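For readers who would rather avoid the namespace axis altogether, a plain DOM walk that collects the xmlns declarations is another option. This is a hedged sketch rather than what the asker ended up using: the class name NamespaceCollector is hypothetical, and it reports where prefixes are declared, not which namespaces are in scope on each node.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceCollector {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns every distinct "prefix=uri" pair declared anywhere in the document. */
    static Set<String> declaredNamespaces(Document doc) {
        Set<String> result = new TreeSet<>();
        collect(doc.getDocumentElement(), result);
        return result;
    }

    private static void collect(Element element, Set<String> out) {
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            String name = attr.getNodeName();
            // Namespace declarations show up in the DOM as xmlns / xmlns:prefix attributes.
            if (name.equals("xmlns")) {
                out.add("(default)=" + attr.getNodeValue());
            } else if (name.startsWith("xmlns:")) {
                out.add(name.substring("xmlns:".length()) + "=" + attr.getNodeValue());
            }
        }
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                collect((Element) c, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<test xmlns:mynsone='http://www.ns.com/test' "
                + "xmlns:demons='http://www.demons.com/test'>"
                + "<p xmlns:domain='http://www.toto.com/test'>this is a test.</p></test>";
        declaredNamespaces(parse(xml)).forEach(System.out::println);
    }
}
```";
java.util.Set<String> ns = NamespaceCollector.declaredNamespaces(NamespaceCollector.parse(xml));
if (ns.size() != 3) throw new AssertionError(ns.toString());
if (!ns.contains("demons=http://www.demons.com/test")) throw new AssertionError(ns.toString());
if (!ns.contains("mynsone=http://www.ns.com/test")) throw new AssertionError(ns.toString());
if (!ns.contains("domain=http://www.toto.com/test")) throw new AssertionError(ns.toString());
</test>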

How to Generate an XML File from a set of XPath Expressions?

I want to be able to generate a complete XML file, given a set of XPath mappings.
The input could be specified in two mappings: (1) one which lists the XPath expressions and values; and (2) the other which defines the appropriate namespaces.
/create/article[1]/id => 1
/create/article[1]/description => bar
/create/article[1]/name[1] => foo
/create/article[1]/price[1]/amount => 00.00
/create/article[1]/price[1]/currency => USD
/create/article[2]/id => 2
/create/article[2]/description => some name
/create/article[2]/name[1] => some description
/create/article[2]/price[1]/amount => 00.01
/create/article[2]/price[1]/currency => USD
For namespaces:
/create => xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'
/create/article => xmlns:ns1='http://predic8.com/material/1/'
/create/article/price => xmlns:ns1='http://predic8.com/common/1/'
/create/article/id => xmlns:ns1='http://predic8.com/material/1/'
Note also that it is important that I deal with XPath attribute expressions as well. For example, I should be able to handle attributes such as:
/create/article/@type => richtext
The final output should then look something like:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<ns1:article xmlns:ns1='http://predic8.com/material/1/' type='richtext'>
<name>foo</name>
<description>bar</description>
<ns1:price xmlns:ns1='http://predic8.com/common/1/'>
<amount>00.00</amount>
<currency>USD</currency>
</ns1:price>
<ns1:id xmlns:ns1='http://predic8.com/material/1/'>1</ns1:id>
</ns1:article>
<ns1:article xmlns:ns1='http://predic8.com/material/2/' type='richtext'>
<name>some name</name>
<description>some description</description>
<ns1:price xmlns:ns1='http://predic8.com/common/2/'>
<amount>00.01</amount>
<currency>USD</currency>
</ns1:price>
<ns1:id xmlns:ns1='http://predic8.com/material/2/'>2</ns1:id>
</ns1:article>
</ns1:create>
PS: This is a more detailed version of a previous question. Due to a series of further requirements and clarifications, I was advised to ask a broader question in order to address my needs.
Note also that I am implementing this in Java, so either a Java-based or an XSLT-based solution would be perfectly acceptable. Thanks.
Further note: I am really looking for a generic solution. The XML shown above is just an example.

This problem has an easy solution if one builds upon the solution of the previous problem:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kNSFor" match="namespace" use="@of"/>
<xsl:variable name="vStylesheet" select="document('')"/>
<xsl:variable name="vPop" as="element()*">
<item path="/create/article/@type">richtext</item>
<item path="/create/article/@lang">en-us</item>
<item path="/create/article[1]/id">1</item>
<item path="/create/article[1]/description">bar</item>
<item path="/create/article[1]/name[1]">foo</item>
<item path="/create/article[1]/price[1]/amount">00.00</item>
<item path="/create/article[1]/price[1]/currency">USD</item>
<item path="/create/article[1]/price[2]/amount">11.11</item>
<item path="/create/article[1]/price[2]/currency">AUD</item>
<item path="/create/article[2]/id">2</item>
<item path="/create/article[2]/description">some name</item>
<item path="/create/article[2]/name[1]">some description</item>
<item path="/create/article[2]/price[1]/amount">00.01</item>
<item path="/create/article[2]/price[1]/currency">USD</item>
<namespace of="create" prefix="ns1:"
url="http://predic8.com/wsdl/material/ArticleService/1/"/>
<namespace of="article" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
<namespace of="@lang" prefix="xml:"
url="http://www.w3.org/XML/1998/namespace"/>
<namespace of="price" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
<namespace of="id" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
</xsl:variable>
<xsl:template match="/">
<xsl:sequence select="my:subTree($vPop/@path/concat(.,'/',string(..)))"/>
</xsl:template>
<xsl:function name="my:subTree" as="node()*">
<xsl:param name="pPaths" as="xs:string*"/>
<xsl:for-each-group select="$pPaths" group-adjacent=
"substring-before(substring-after(concat(., '/'), '/'), '/')">
<xsl:if test="current-grouping-key()">
<xsl:choose>
<xsl:when test=
"substring-after(current-group()[1], current-grouping-key())">
<xsl:variable name="vLocal-name" select=
"substring-before(concat(current-grouping-key(), '['), '[')"/>
<xsl:variable name="vNamespace"
select="key('kNSFor', $vLocal-name, $vStylesheet)"/>
<xsl:choose>
<xsl:when test="starts-with($vLocal-name, '@')">
<xsl:attribute name=
"{$vNamespace/@prefix}{substring($vLocal-name,2)}"
namespace="{$vNamespace/@url}">
<xsl:value-of select=
"substring(
substring-after(current-group(), current-grouping-key()),
2
)"/>
</xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:element name="{$vNamespace/@prefix}{$vLocal-name}"
namespace="{$vNamespace/@url}">
<xsl:sequence select=
"my:subTree(for $s in current-group()
return
concat('/',substring-after(substring($s, 2),'/'))
)
"/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-grouping-key()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:for-each-group>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the wanted, correct result is produced:
<ns1:create xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/">
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/" type="richtext"
xml:lang="en-us"/>
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
<ns1:id>1</ns1:id>
<description>bar</description>
<name>foo</name>
<ns1:price>
<amount>00.00</amount>
<currency>USD</currency>
</ns1:price>
<ns1:price>
<amount>11.11</amount>
<currency>AUD</currency>
</ns1:price>
</ns1:article>
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
<ns1:id>2</ns1:id>
<description>some name</description>
<name>some description</name>
<ns1:price>
<amount>00.01</amount>
<currency>USD</currency>
</ns1:price>
</ns1:article>
</ns1:create>
Explanation:
A reasonable assumption is made that throughout the generated document any two elements with the same local-name() belong to the same namespace -- this covers the predominant majority of real-world XML documents.
The namespace specifications follow the path specifications. A namespace specification has the form: <namespace of="target element's local-name" prefix="wanted prefix" url="namespace-uri"/>
Before generating an element with xsl:element, the appropriate namespace specification is selected using an index created by an xsl:key. From this namespace specification the values of its prefix and url attributes are used in specifying in the xsl:element instruction the values of the full element name and the element's namespace-uri.
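For readers working in Java rather than XSLT, the same lookup idea (one namespace specification per local name) could be sketched with the DOM API as below. This is only an illustration under the same assumption as above, that a local name always maps to a single namespace; the class name NamespaceByLocalName and the NsSpec holder are hypothetical and not part of the stylesheet.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NamespaceByLocalName {

    /** Hypothetical holder for one namespace specification: wanted prefix plus namespace URI. */
    static final class NsSpec {
        final String prefix;
        final String uri;

        NsSpec(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }
    }

    /** Creates an element, qualifying it only when a specification exists for its local name. */
    static Element create(Document doc, String localName, Map<String, NsSpec> specs) {
        NsSpec spec = specs.get(localName);
        if (spec == null) {
            // No specification: the element stays in no namespace.
            return doc.createElementNS(null, localName);
        }
        return doc.createElementNS(spec.uri, spec.prefix + ":" + localName);
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Map<String, NsSpec> specs = new HashMap<>();
        specs.put("create", new NsSpec("ns1", "http://predic8.com/wsdl/material/ArticleService/1/"));
        specs.put("article", new NsSpec("ns1", "http://predic8.com/material/1/"));

        Element root = create(doc, "create", specs);
        doc.appendChild(root);
        root.appendChild(create(doc, "article", specs)); // qualified
        root.appendChild(create(doc, "name", specs));    // unqualified
        System.out.println(root.getNodeName() + " -> " + root.getNamespaceURI());
    }
}
```

Keying the map by local name mirrors the xsl:key index; a document in which the same local name appears in two namespaces would need a richer key, for example the full path.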

Interesting question. Let's assume that your input set of XPath expressions satisfies some reasonable constraints, for example that if there is an X/article[2] then there is also (preceding it) an X/article[1]. And let's put the namespace part of the problem to one side for the moment.
Let's go for an XSLT 2.0 solution: we'll start with the input in the form
<paths>
<path value="1">/create/article[1]/id</path>
<path value="bar">/create/article[1]/description</path>
</paths>
and then we'll turn this into
<paths>
<path value="1"><step>create</step><step>article[1]</step><step>id</step></path>
...
</paths>
Now we'll call a function which does a grouping on the first step, and calls itself recursively to do grouping on the next step:
<xsl:function name="f:group">
<xsl:param name="paths" as="element(path)*"/>
<xsl:param name="step" as="xs:integer"/>
<xsl:for-each-group select="$paths" group-by="step[$step]">
<xsl:element name="{replace(current-grouping-key(), '\[.*', '')}">
<xsl:choose>
<xsl:when test="count(current-group()) gt 1">
<xsl:sequence select="f:group(current-group(), $step+1)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-group()[1]/@value"/>
</xsl:otherwise>
</xsl:choose>
</xsl:element>
</xsl:for-each-group>
</xsl:function>
That's untested, and there may well be details you have to adjust to get it working. But I think the basic approach should work.
The namespace part of the problem is perhaps best tackled by preprocessing the list of paths to add a namespace attribute to each step element; this can then be used in the xsl:element instruction to put the element in the right namespace.
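Since the question also allows a Java solution, the step-by-step idea (split each path into steps, then find or create the element for every step) might look roughly like the sketch below. It is an assumption-laden illustration, not a translation of the XSLT: the class name PathTreeBuilder is made up, namespaces are ignored, a step without a predicate is treated as [1], only numeric predicates and a trailing @attribute step are understood, and all paths are assumed to share one root element.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class PathTreeBuilder {

    /** Builds a document from simple paths such as /a/b[2]/c or /a/b[1]/@type. */
    static Document build(Map<String, String> pathToValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        for (Map.Entry<String, String> entry : pathToValue.entrySet()) {
            String[] steps = entry.getKey().substring(1).split("/"); // drop the leading '/'
            Node parent = doc;
            for (int i = 0; i < steps.length; i++) {
                String step = steps[i];
                if (step.startsWith("@")) {
                    // Attribute step: set it on the element reached so far.
                    ((Element) parent).setAttribute(step.substring(1), entry.getValue());
                    break;
                }
                int bracket = step.indexOf('[');
                String name = bracket < 0 ? step : step.substring(0, bracket);
                int index = bracket < 0
                        ? 1
                        : Integer.parseInt(step.substring(bracket + 1, step.length() - 1));
                Element child = nthChild(doc, parent, name, index);
                if (i == steps.length - 1) {
                    child.setTextContent(entry.getValue()); // last step carries the value
                }
                parent = child;
            }
        }
        return doc;
    }

    /** Returns the index-th child element called name, creating missing siblings as needed. */
    private static Element nthChild(Document doc, Node parent, String name, int index) {
        int seen = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                seen++;
                if (seen == index) {
                    return (Element) n;
                }
            }
        }
        Element created = null;
        while (seen < index) {
            created = doc.createElement(name);
            parent.appendChild(created);
            seen++;
        }
        return created;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("/create/article[1]/id", "1");
        paths.put("/create/article[1]/@type", "richtext");
        paths.put("/create/article[2]/id", "2");
        Document doc = build(paths);
        System.out.println(doc.getDocumentElement().getElementsByTagName("article").getLength());
    }
}
```

Unlike the grouping approach, this version processes paths one at a time, so the input does not have to be sorted; the price is a linear scan of the siblings for every step.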

I came across a similar situation where I had to convert a set of XPath/FQN-to-value mappings to XML. A simple, generic solution is the following code, which can be extended to meet specific requirements.
public class XMLUtils {
static public String transformToXML(Map<String, String> pathValueMap, String delimiter)
throws ParserConfigurationException, TransformerException {
DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootElement = null;
Iterator<Entry<String, String>> it = pathValueMap.entrySet().iterator();
while (it.hasNext()) {
Entry<String, String> pair = it.next();
if (pair.getKey() != null && !pair.getKey().isEmpty() && rootElement == null) {
String[] pathValuesplit = pair.getKey().split(delimiter);
rootElement = document.createElement(pathValuesplit[0]);
break;
}
}
document.appendChild(rootElement);
Element rootNode = rootElement;
Iterator<Entry<String, String>> iterator = pathValueMap.entrySet().iterator();
while (iterator.hasNext()) {
Entry<String, String> pair = iterator.next();
if (pair.getKey() != null && !pair.getKey().isEmpty() && rootElement != null) {
String[] pathValuesplit = pair.getKey().split(delimiter);
if (pathValuesplit[0].equals(rootElement.getNodeName())) {
int i = pathValuesplit.length;
Element parentNode = rootNode;
int j = 1;
while (j < i) {
Element child = null;
NodeList childNodes = parentNode.getChildNodes();
for (int k = 0; k < childNodes.getLength(); k++) {
if (childNodes.item(k).getNodeName().equals(pathValuesplit[j])
&& childNodes.item(k) instanceof Element) {
child = (Element) childNodes.item(k);
break;
}
}
if (child == null) {
child = document.createElement(pathValuesplit[j]);
if (j == (i - 1)) {
child.appendChild(
document.createTextNode(pair.getValue() == null ? "" : pair.getValue()));
}
}
parentNode.appendChild(child);
parentNode = child;
j++;
}
} else {
// ignore any other root - add logger
System.out.println("Data not processed for node: " + pair.getKey());
}
}
}
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource domSource = new DOMSource(document);
// to return a XMLstring in response to an API
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
StreamResult resultToFile = new StreamResult(new File("C:/EclipseProgramOutputs/GeneratedXMLFromPathValue.xml"));
transformer.transform(domSource, resultToFile);
transformer.transform(domSource, result);
return writer.toString();
}
public static void main(String args[])
{
Map<String, String> pathValueMap = new HashMap<String, String>();
String delimiter = "/";
pathValueMap.put("create/article__1/id", "1");
pathValueMap.put("create/article__1/description", "something");
pathValueMap.put("create/article__1/name", "Book Name");
pathValueMap.put("create/article__1/price/amount", "120" );
pathValueMap.put("create/article__1/price/currency", "INR");
pathValueMap.put("create/article__2/id", "2");
pathValueMap.put("create/article__2/description", "something else");
pathValueMap.put("create/article__2/name", "Book name 1");
pathValueMap.put("create/article__2/price/amount", "2100");
pathValueMap.put("create/article__2/price/currency", "USD");
try {
XMLUtils.transformToXML(pathValueMap, delimiter);
} catch (ParserConfigurationException | TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}}
Output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<create>
<article__1>
<id>1</id>
<name>Book Name</name>
<description>something</description>
<price>
<currency>INR</currency>
<amount>120</amount>
</price>
</article__1>
<article__2>
<description>something else</description>
<name>Book name 1</name>
<id>2</id>
<price>
<currency>USD</currency>
<amount>2100</amount>
</price>
</article__2>
</create>
To remove the __N index suffixes, you can apply a regular expression to the final string, for example:
resultString = resultString.replaceAll("(__[0-9][0-9])|(__[0-9])", "");
This would do the cleaning job

How do I remove namespaces from xml, using java dom?

I have the following code
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
doc_.getDocumentElement().normalize();
Then I can do
doc_.getDocumentElement();
and get my first element but the problem is instead of being job the element is tns:job.
I know about and have tried to use:
dbFactory_.setNamespaceAware(true);
but that is just not what I'm looking for, I need something to completely get rid of namespaces.
Any help would be appreciated,
Thanks,
Josh

You can use regular expressions to strip the namespaces from the XML string:
public static String removeXmlStringNamespaceAndPreamble(String xmlString) {
return xmlString.replaceAll("(<\\?[^<]*\\?>)?", ""). /* remove preamble */
replaceAll("xmlns.*?(\"|\').*?(\"|\')", "") /* remove xmlns declaration */
.replaceAll("(<)(\\w+:)(.*?>)", "$1$3") /* remove opening tag prefix */
.replaceAll("(</)(\\w+:)(.*?>)", "$1$3"); /* remove closing tags prefix */
}

For Element and Attribute nodes:
Node node = ...;
String name = node.getLocalName();
will give you the local part of the node's name.
See Node.getLocalName()
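A small sketch of how this plays out with the parser setup from the question. The class name LocalNameDemo and the urn:example:jobs namespace are made up for illustration. The factory is made namespace-aware here because the DOM API documents getLocalName() as returning null for nodes created through DOM Level 1 methods, and the JDK's parser typically behaves the same way when namespace awareness is off.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class LocalNameDemo {

    /** Parses the XML string and returns its root element. */
    static Element parseRoot(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical namespace URI, used only for this example.
        String xml = "<tns:job xmlns:tns='urn:example:jobs'/>";
        Element root = parseRoot(xml, true);
        System.out.println(root.getNodeName());  // tns:job
        System.out.println(root.getLocalName()); // job
    }
}
```

The document itself keeps its namespaces; only the name you read from each node changes.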

You can pre-process XML to remove all namespaces, if you absolutely must do so. I'd recommend against it, as removing namespaces from an XML document is in essence comparable to removing namespaces from a programming framework or library - you risk name clashes and lose the ability to differentiate between once-distinct elements. However, it's your funeral. ;-)
This XSLT transformation removes all namespaces from any XML document.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Apply it to your XML document. Java examples for doing such a thing should be plenty, even on this site. The resulting document will be exactly of the same structure and layout, just without namespaces.
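One way to apply such a stylesheet from Java is the standard javax.xml.transform API, sketched below. The class name StripNamespaces is hypothetical, and the embedded stylesheet is a slightly adjusted variant rather than the exact one above: the attribute template writes the value with xsl:value-of, and the copy template matches only text, comments and processing instructions so that it cannot tie with the element template on priority.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class StripNamespaces {

    // Adjusted variant of the namespace-stripping stylesheet (see the note above the code).
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='*'>"
        + "<xsl:element name='{local-name()}'>"
        + "<xsl:apply-templates select='@*|node()'/>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "<xsl:template match='@*'>"
        + "<xsl:attribute name='{local-name()}'><xsl:value-of select='.'/></xsl:attribute>"
        + "</xsl:template>"
        + "<xsl:template match='text()|comment()|processing-instruction()'>"
        + "<xsl:copy/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the input XML with every element and attribute reduced to its local name. */
    static String strip(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input; the namespace URI is made up for the example.
        String xml = "<tns:job xmlns:tns='urn:example:jobs' tns:id='7'>"
                   + "<tns:name>build</tns:name></tns:job>";
        System.out.println(strip(xml));
    }
}
```

If two attributes on one element differ only by prefix, they collapse to the same local name and one of them is lost, which is the name-clash risk mentioned above.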

Rather than
dbFactory_.setNamespaceAware(true);
Use
dbFactory_.setNamespaceAware(false);
Although I agree with Tomalak: in general, namespaces are more helpful than harmful. Why don't you want to use them?
Edit: this answer doesn't answer the OP's question, which was how to get rid of namespace prefixes. RD01 provided the correct answer to that.

Tomalak, one fix of your XSLT (in 3rd template):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node() | @*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<!-- Here! -->
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

public static void wipeRootNamespaces(Document xml) {
Node root = xml.getDocumentElement();
NodeList rootchildren = root.getChildNodes();
Element newroot = xml.createElement(root.getNodeName());
for (int i=0;i<rootchildren.getLength();i++) {
newroot.appendChild(rootchildren.item(i).cloneNode(true));
}
xml.replaceChild(newroot, root);
}

The size of the input XML also needs to be considered when choosing a solution. For large XML documents, around 100k in size (which is possible if your input comes from a web service), you also need to consider the garbage-collection implications of manipulating a large string. We used String.replaceAll before, and it caused frequent OOM errors in production with a 1.5G heap because of the way replaceAll is implemented.
You can reference http://app-inf.blogspot.com/2013/04/pitfalls-of-handling-large-string.html for our findings.
I am not sure how XSLT deals with large String objects, but we ended up parsing the string manually to remove prefixes in a single pass, to avoid creating additional large Java objects.
public static String removePrefixes(String input1) {
String ret = null;
int strStart = 0;
boolean finished = false;
if (input1 != null) {
//BE CAREFUL : allocate enough size for StringBuffer to avoid expansion
StringBuffer sb = new StringBuffer(input1.length());
while (!finished) {
int start = input1.indexOf('<', strStart);
int end = input1.indexOf('>', strStart);
if (start != -1 && end != -1) {
// Appending anything before '<', including '<'
sb.append(input1, strStart, start + 1);
String tag = input1.substring(start + 1, end);
if (tag.charAt(0) == '/') {
// Appending '/' if it is "</"
sb.append('/');
tag = tag.substring(1);
}
int colon = tag.indexOf(':');
int space = tag.indexOf(' ');
if (colon != -1 && (space == -1 || colon < space)) {
tag = tag.substring(colon + 1);
}
// Appending tag with prefix removed, and ">"
sb.append(tag).append('>');
strStart = end + 1;
} else {
finished = true;
}
}
//BE CAREFUL : use new String(sb) instead of sb.toString for large Strings
ret = new String(sb);
}
return ret;
}

Instead of using TransformerFactory and then calling transform on it (which was injecting the empty namespace), I serialized the document as follows:
OutputStream outputStream = new FileOutputStream(new File(xMLFilePath));
OutputFormat outputFormat = new OutputFormat(doc, "UTF-8", true);
outputFormat.setOmitComments(true);
outputFormat.setLineWidth(0);
XMLSerializer serializer = new XMLSerializer(outputStream, outputFormat);
serializer.serialize(doc);
outputStream.close();

I also faced the namespace issue and was unable to read the XML file in Java. Below is the solution:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false); // the important line: turns off namespace processing in the parser
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("XML/"+ fileName);
