I am working on a parser written in Java. I can receive XML feeds from various locations, with various contents. I need to extract all the namespaces from the feed, to call this or that according to the feed. I have some trouble obtaining this in Java, and i am not really sure where the issue is.
Let's consider this XML:
<?xml version="1.0"?>
<?xml-stylesheet type='text/xsl' href='new.xsl'?>
<test xmlns:mynsone="http://www.ns.com/test" xmlns:demons="http://www.demons.com/test">
<p xmlns:domain="http://www.toto.com/test">
this is a test.
</p>
</test>
In order to test my xPath expression (i am rather new to it), i wrote a little .xsl script applied to that XML:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output
method="html"
encoding="ISO-8859-1"
doctype-public="-//W3C//DTD XHTML//EN"
doctype-system="http://www.w3.org/TR/2001/REC-xhtml11-20010531"
indent="yes" />
<xsl:template match="/">
<xsl:for-each select="//namespace::*">
<xsl:value-of select="." />
<xsl:text> </xsl:text><br />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
And this correctly provides me the list of namespaces encountered iterating the nodes:
http://www.w3.org/XML/1998/namespace
http://www.demons.com/test
http://www.ns.com/test
http://www.w3.org/XML/1998/namespace
http://www.demons.com/test
http://www.ns.com/test
http://www.toto.com/test
Now i get back to Java: here is the code i use.
InputStream file = url.openStream();
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
org.w3c.dom.Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//namespace::*";
System.out.println(expression);
NodeList nodelist = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int k = 0; k < nodelist.getLength(); k++)
{
Node mynode = nodelist.item(k);
System.out.println(mynode.toString());
}
And here is the result i obtain:
xmlns:mynsone="http://www.ns.com/test"
org.apache.xml.dtm.ref.dom2dtm.DOM2DTMdefaultNamespaceDeclarationNode#7dbb8ca4
xmlns:domain="http://www.toto.com/test"
Therefore, the "demons" namespace is not returned. The problem is that if i put several namespaces on 1 node, only 1 is return in Java, whereas on the XSL script all are displayed.
I hope i maed myself clear; i spent the past days on the web browsing for examples, and i dont know if im really close but just missing a little something or if my expression is simply not proper..
Thanks in advance.
OK so i eventually used xPath 2.0 to do it, using saxon-HE 9.4:
public static boolean detectGeoRssNamespace(InputStream sourceFeed) {
try {
if (sourceFeed.markSupported()) {
sourceFeed.reset();
}
String objectModel = NamespaceConstant.OBJECT_MODEL_SAXON;
System.setProperty("javax.xml.xpath.XPathFactory:"+NamespaceConstant.OBJECT_MODEL_SAXON, "net.sf.saxon.xpath.XPathFactoryImpl");
XPathFactory xpathFactory = XPathFactory.newInstance(objectModel);
XPath xpath = xpathFactory.newXPath();
InputSource is = new InputSource(sourceFeed);
SAXSource ss = new SAXSource(is);
NodeInfo doc = ((XPathEvaluator)xpath).setSource(ss);
String xpathExpressionStr = "distinct-values(//*[name()!=local-name()]/ concat('prefix=', substring-before(name(), ':'), '&uri=', namespace-uri()))";
XPathExpression xpathExpression = xpath.compile(xpathExpressionStr);
List nodelist = (List)xpathExpression.evaluate(doc, XPathConstants.NODESET);
System.out.println("<output>");
Iterator iter = nodelist.iterator();
while ( iter.hasNext() ) {
Object line = (Object)iter.next();
System.out.println(line.toString());
}
System.out.println("</output>");
} catch (XPathFactoryConfigurationException e) {
e.printStackTrace();
} catch (XPathException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
what if you extract this namespaces to different xml elements.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output
method="html"
encoding="ISO-8859-1"
doctype-public="-//W3C//DTD XHTML//EN"
doctype-system="http://www.w3.org/TR/2001/REC-xhtml11-20010531"
indent="yes" />
<xsl:template match="/">
<xsl:for-each select="//namespace::*">
<namespace>
<xsl:value-of select="." />
</namespace>
<xsl:text> </xsl:text><br />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Question was solved using xPath 2.0 (code included in question)
Related
currently I am working on XML modify task in my project for that I want to change xml element from source code below is my XML:
<Client_list>
<Description>
<ip>192.168.11.206</ip>
<name>vishal suhagiya</name>
</Description>
<Description>
<ip>192.168.11.205</ip>
<name>kinnari jasoliya</name>
</Description>
</Client_list>
I wrote in java like:
for (int i = 0; i < nodes.getLength(); i++) {
Element Description = (Element)nodes.item(i);
Node element = nodes.item(i);
Element ip = (Element)Description.getElementsByTagName("ip_address").item(0);
String pName = ip.getTextContent();
String Client = jTextField4.getText();
if (pName.equals(Client)) {
if("Name".equals(element.getNodeName()))
{
element.setTextContent(jTextField4.getText());
}
}
I need that if I want to change name of 192.168.11.205's then how can I change?
So how can I change name in XML based on ip address
Do the whole thing in XSLT.
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="ip"/>
<xsl:param name="newName"/>
<xsl:template match="*">
<xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
<xsl:template match="name[../ip=$ip]">
<name><xsl:value-of select="$newName"/></name>
</xsl:template>
</xsl:transform>
I use xpath to manage this kind of updates.
XPath xpath = XPathFactory.newInstance().newXPath();
try {
NodeList nodes = (NodeList) xpath.evaluate("//Description[ip='192.168.11.205']/name", doc, XPathConstants.NODESET);
for(int i=0;i<nodes.getLength();i++){
Element nameElement = (Element) nodes.item(i);
nameElement.setTextContent("NewValue");
}
} catch (XPathExpressionException e) {
// Some error management here
}
If you need to manage XML documents as java objects you may want to have a look at a JAXB tutorial.
Following is a snippet of .xml file. I did following :
Document doc = docBuilder.parse(filesInDirectory.get(i));
doc.getDocumentElement().normalize();
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression expr1 = xPath.compile("//codes[# class ='class2']/code[#code]");
Object result1 = expr1.evaluate(doc, XPathConstants.NODESET);
NodeList nodes1 = (NodeList) result1;
Now,
System.out.println("result length"+":"+nodes1.getLength());
returns 2.
I would like to make logical decision based on the attribute names, like(pseudocode)
if(nodes1.contains(123))
or
if(nodes1.contains(123) && nodes1.contains(456))
and make decision.
how would i do it?
<metadata>
<codes class="class1">
<code code="ABC">
<detail "blah" "blah">
</code>
</codes>
<codes class="class2">
<code code="123">
<detail "blah blah"/>
</code>
<code code="456">
<detail "blah blah"/>
</code>
</codes>
</metadata>
This:
XPathExpression expr1 = xPath.compile("//codes[#class]");
Object result1 = expr1.evaluate(doc, XPathConstants.NODESET);
NodeList nodes1 = (NodeList) result1;
should return you a list of elements with a class attribute. Iterate over this node list and foreach node extract the [#code] element and use a check like
if (node.getNodeValue().equals("123"))
to establish whether your node has the value you are looking for.
Try out this:
File f = new File("test.xml");
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
InputSource src = new InputSource(new FileInputStream(f));
Object result = xpath.evaluate("//codes[#class='class2']/code/#code",src,XPathConstants.NODESET);
NodeList lst = (NodeList)result;
List<String> codeList = new ArrayList<String>();
for(int idx=0; idx<lst.getLength(); idx++){
codeList.add(lst.item(idx).getNodeValue());
}
if(codeList.contains("123")){
System.out.println("123");
}
if(codeList.contains("123") && codeList.contains("456")){
System.out.println("123 and 456");
}
Explanation:
XPath //codes[#class='class2']/code/#code will collect all code values under codes with having class as class2.
You can then build a List from NodeList so that you can use contains() method.
Use this XPath expression:
/*/codes[#class]/code[#code = '123' or #code = '456']
It selects any code element whose code attribute's string value is one of the strings "123" or "456" and that (the code element) is a child of a codes element that has a `class attribute and is a child of the top element of the XML document.
XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/codes[#class]/code[#code = '123' or #code = '456']"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document (corrected to be made well-formed):
<metadata>
<codes class="class1">
<code code="ABC">
<detail/>
</code>
</codes>
<codes class="class2">
<code code="123">
<detail />
</code>
<code code="456">
<detail />
</code>
</codes>
</metadata>
the XPath expression is evaluated and the selected nodes are copied to the output:
<code code="123">
<detail/>
</code>
<code code="456">
<detail/>
</code>
Explanation:
Proper use of the standard XPath operator or.
I want to be able to generate a complete XML file, given a set of XPath mappings.
The input could specified in two mappings: (1) One which lists the XPath expressions and values; and (2) the other which defines the appropriate namespaces.
/create/article[1]/id => 1
/create/article[1]/description => bar
/create/article[1]/name[1] => foo
/create/article[1]/price[1]/amount => 00.00
/create/article[1]/price[1]/currency => USD
/create/article[2]/id => 2
/create/article[2]/description => some name
/create/article[2]/name[1] => some description
/create/article[2]/price[1]/amount => 00.01
/create/article[2]/price[1]/currency => USD
For namespaces:
/create => xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/
/create/article => xmlns:ns1='http://predic8.com/material/1/‘
/create/article/price => xmlns:ns1='http://predic8.com/common/1/‘
/create/article/id => xmlns:ns1='http://predic8.com/material/1/'
Note also, that it is important that I also deal with XPath Attributes expressions as well. For example: I should also be able to handle attributes, such as:
/create/article/#type => richtext
The final output should then look something like:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<ns1:article xmlns:ns1='http://predic8.com/material/1/‘ type='richtext'>
<name>foo</name>
<description>bar</description>
<ns1:price xmlns:ns1='http://predic8.com/common/1/'>
<amount>00.00</amount>
<currency>USD</currency>
</ns1:price>
<ns1:id xmlns:ns1='http://predic8.com/material/1/'>1</ns1:id>
</ns1:article>
<ns1:article xmlns:ns1='http://predic8.com/material/2/‘ type='richtext'>
<name>some name</name>
<description>some description</description>
<ns1:price xmlns:ns1='http://predic8.com/common/2/'>
<amount>00.01</amount>
<currency>USD</currency>
</ns1:price>
<ns1:id xmlns:ns1='http://predic8.com/material/2/'>2</ns1:id>
</ns1:article>
</ns1:create>
PS: This is a more detailed question to a previous question asked, although due to a series of further requirements and clarifications, I was recommended to ask a more broader question in order to address my needs.
Note also, I am implementing this in Java. So either a Java-based or XSLT-based solution would both be perfectly acceptable. thnx.
Further note: I am really looking for a generic solution. The XML shown above is just an example.
This problem has an easy solution if one builds upon the solution of the previous problem:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kNSFor" match="namespace" use="#of"/>
<xsl:variable name="vStylesheet" select="document('')"/>
<xsl:variable name="vPop" as="element()*">
<item path="/create/article/#type">richtext</item>
<item path="/create/article/#lang">en-us</item>
<item path="/create/article[1]/id">1</item>
<item path="/create/article[1]/description">bar</item>
<item path="/create/article[1]/name[1]">foo</item>
<item path="/create/article[1]/price[1]/amount">00.00</item>
<item path="/create/article[1]/price[1]/currency">USD</item>
<item path="/create/article[1]/price[2]/amount">11.11</item>
<item path="/create/article[1]/price[2]/currency">AUD</item>
<item path="/create/article[2]/id">2</item>
<item path="/create/article[2]/description">some name</item>
<item path="/create/article[2]/name[1]">some description</item>
<item path="/create/article[2]/price[1]/amount">00.01</item>
<item path="/create/article[2]/price[1]/currency">USD</item>
<namespace of="create" prefix="ns1:"
url="http://predic8.com/wsdl/material/ArticleService/1/"/>
<namespace of="article" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
<namespace of="#lang" prefix="xml:"
url="http://www.w3.org/XML/1998/namespace"/>
<namespace of="price" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
<namespace of="id" prefix="ns1:"
url="xmlns:ns1='http://predic8.com/material/1/"/>
</xsl:variable>
<xsl:template match="/">
<xsl:sequence select="my:subTree($vPop/#path/concat(.,'/',string(..)))"/>
</xsl:template>
<xsl:function name="my:subTree" as="node()*">
<xsl:param name="pPaths" as="xs:string*"/>
<xsl:for-each-group select="$pPaths" group-adjacent=
"substring-before(substring-after(concat(., '/'), '/'), '/')">
<xsl:if test="current-grouping-key()">
<xsl:choose>
<xsl:when test=
"substring-after(current-group()[1], current-grouping-key())">
<xsl:variable name="vLocal-name" select=
"substring-before(concat(current-grouping-key(), '['), '[')"/>
<xsl:variable name="vNamespace"
select="key('kNSFor', $vLocal-name, $vStylesheet)"/>
<xsl:choose>
<xsl:when test="starts-with($vLocal-name, '#')">
<xsl:attribute name=
"{$vNamespace/#prefix}{substring($vLocal-name,2)}"
namespace="{$vNamespace/#url}">
<xsl:value-of select=
"substring(
substring-after(current-group(), current-grouping-key()),
2
)"/>
</xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:element name="{$vNamespace/#prefix}{$vLocal-name}"
namespace="{$vNamespace/#url}">
<xsl:sequence select=
"my:subTree(for $s in current-group()
return
concat('/',substring-after(substring($s, 2),'/'))
)
"/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-grouping-key()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:for-each-group>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the wanted, correct result is produced:
<ns1:create xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/">
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/" type="richtext"
xml:lang="en-us"/>
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
<ns1:id>1</ns1:id>
<description>bar</description>
<name>foo</name>
<ns1:price>
<amount>00.00</amount>
<currency>USD</currency>
</ns1:price>
<ns1:price>
<amount>11.11</amount>
<currency>AUD</currency>
</ns1:price>
</ns1:article>
<ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
<ns1:id>2</ns1:id>
<description>some name</description>
<name>some description</name>
<ns1:price>
<amount>00.01</amount>
<currency>USD</currency>
</ns1:price>
</ns1:article>
</ns1:create>
Explanation:
A reasonable assumption is made that throughout the generated document any two elements with the same local-name() belong to the same namespace -- this covers the predominant majority of real-world XML documents.
The namespace specifications follow the path specifications. A nsmespace specification has the form: <namespace of="target element's local-name" prefix="wanted prefix" url="namespace-uri"/>
Before generating an element with xsl:element, the appropriate namespace specification is selected using an index created by an xsl:key. From this namespace specification the values of its prefix and url attributes are used in specifying in the xsl:element instruction the values of the full element name and the element's namespace-uri.
Interesting question. Let's assume that your input set of XPath expressions satisfies some reasonsable constraints, for example if there is an X/article[2] then there also (preceding it) an X/article[1]. And let's put the namespace part of the problem to one side for the moment.
Let's go for an XSLT 2.0 solution: we'll start with the input in the form
<paths>
<path value="1">/create/article[1]/id</path>
<path value="bar">/create/article[1]/description</path>
</paths>
and then we'll turn this into
<paths>
<path value="1"><step>create</step><step>article[1]</step><step>id</step></path>
...
</paths>
Now we'll call a function which does a grouping on the first step, and calls itself recursively to do grouping on the next step:
<xsl:function name="f:group">
<xsl:param name="paths" as="element(path)*"/>
<xsl:param name="step" as="xs:integer"/>
<xsl:for-each-group select="$paths" group-by="step[$step]">
<xsl:element name="{replace(current-grouping-key(), '\[.*', '')}">
<xsl:choose>
<xsl:when test="count(current-group) gt 1">
<xsl:sequence select="f:group(current-group(), $step+1)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-group()[1]/#value"/>
</xsl:otherwise>
</xsl:choose>
</xsl:element>
</xsl:for-each-group>
</xsl:function>
That's untested, and there may well be details you have to adjust to get it working. But I think the basic approach should work.
The namespace part of the problem is perhaps best tackled by preprocessing the list of paths to add a namespace attribute to each step element; this can then be used in the xsl:element instruction to put the element in the right namespace.
i came across a similar situation where i had to convert Set of XPath/FQN - value mappings to XML. A generic simple solution can be using the following code, which can be enhanced to specific requirements.
public class XMLUtils {
static public String transformToXML(Map<String, String> pathValueMap, String delimiter)
throws ParserConfigurationException, TransformerException {
DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootElement = null;
Iterator<Entry<String, String>> it = pathValueMap.entrySet().iterator();
while (it.hasNext()) {
Entry<String, String> pair = it.next();
if (pair.getKey() != null && pair.getKey() != "" && rootElement == null) {
String[] pathValuesplit = pair.getKey().split(delimiter);
rootElement = document.createElement(pathValuesplit[0]);
break;
}
}
document.appendChild(rootElement);
Element rootNode = rootElement;
Iterator<Entry<String, String>> iterator = pathValueMap.entrySet().iterator();
while (iterator.hasNext()) {
Entry<String, String> pair = iterator.next();
if (pair.getKey() != null && pair.getKey() != "" && rootElement != null) {
String[] pathValuesplit = pair.getKey().split(delimiter);
if (pathValuesplit[0].equals(rootElement.getNodeName())) {
int i = pathValuesplit.length;
Element parentNode = rootNode;
int j = 1;
while (j < i) {
Element child = null;
NodeList childNodes = parentNode.getChildNodes();
for (int k = 0; k < childNodes.getLength(); k++) {
if (childNodes.item(k).getNodeName().equals(pathValuesplit[j])
&& childNodes.item(k) instanceof Element) {
child = (Element) childNodes.item(k);
break;
}
}
if (child == null) {
child = document.createElement(pathValuesplit[j]);
if (j == (i - 1)) {
child.appendChild(
document.createTextNode(pair.getValue() == null ? "" : pair.getValue()));
}
}
parentNode.appendChild(child);
parentNode = child;
j++;
}
} else {
// ignore any other root - add logger
System.out.println("Data not processed for node: " + pair.getKey());
}
}
}
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource domSource = new DOMSource(document);
// to return a XMLstring in response to an API
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
StreamResult resultToFile = new StreamResult(new File("C:/EclipseProgramOutputs/GeneratedXMLFromPathValue.xml"));
transformer.transform(domSource, resultToFile);
transformer.transform(domSource, result);
return writer.toString();
}
public static void main(String args[])
{
Map<String, String> pathValueMap = new HashMap<String, String>();
String delimiter = "/";
pathValueMap.put("create/article__1/id", "1");
pathValueMap.put("create/article__1/description", "something");
pathValueMap.put("create/article__1/name", "Book Name");
pathValueMap.put("create/article__1/price/amount", "120" );
pathValueMap.put("create/article__1/price/currency", "INR");
pathValueMap.put("create/article__2/id", "2");
pathValueMap.put("create/article__2/description", "something else");
pathValueMap.put("create/article__2/name", "Book name 1");
pathValueMap.put("create/article__2/price/amount", "2100");
pathValueMap.put("create/article__2/price/currency", "USD");
try {
XMLUtils.transformToXML(pathValueMap, delimiter);
} catch (ParserConfigurationException | TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}}
Output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<create>
<article__1>
<id>1</id>
<name>Book Name</name>
<description>something</description>
<price>
<currency>INR</currency>
<amount>120</amount>
</price>
</article__1>
<article__2>
<description>something else</description>
<name>Book name 1</name>
<id>2</id>
<price>
<currency>USD</currency>
<amount>2100</amount>
</price>
</article__2>
To remove __%num , can use regular expressions on final string. like:
resultString = resultString.replaceAll("(__[0-9][0-9])|(__[0-9])", "");
This would do the cleaning job
Given xml like this:
<container>
<item>
<xmlText>
<someTag>
<otherTag>
Text
</otherTag>
</someTag>
</xmlText>
</item>
<container>
I would like to select all text that is under item/xmlText. I would like to print all the content of this node with tags (someTag, otherTag).
I would prefer to handle with this with XPath, but this is part of Java program, so if there is such mechanism I could take it as well.
Use XSLT for this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select="/container/item/xmlText/node()"/>
</xsl:template>
</xsl:stylesheet>
When this is applied on the provided XML document (corrected to be well-formed !!!):
<container>
<item>
<xmlText>
<someTag>
<otherTag>
Text
</otherTag>
</someTag>
</xmlText>
</item>
</container>
the wanted, correct result is produced:
<someTag>
<otherTag>
Text
</otherTag>
</someTag>
When this is your Element retrieved with XPath
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
Element element = (Element) xpath.evaluate(
"/container/item/xmlText", document, XPathConstants.NODE);
Then, you can do something along these lines:
java.io.ByteArrayOutputStream data =
new java.io.ByteArrayOutputStream();
java.io.PrintStream ps = new java.io.PrintStream(data);
// These classes are part of Xerces. But you will find them in your JDK,
// as well, in a different package. Use any encoding here:
org.apache.xml.serialize.OutputFormat of =
new org.apache.xml.serialize.OutputFormat("XML", "ISO-8859-1", true);
org.apache.xml.serialize.XMLSerializer serializer =
new org.apache.xml.serialize.XMLSerializer(ps, of);
// Here, serialize the element that you obtained using your XPath expression.
serializer.asDOMSerializer();
serializer.serialize(element);
// The output stream now holds serialized XML data, including tags/attributes...
return data.toString();
UPDATE
This would be more concise, rather than using Xerces internals. It's the same as Dimitre's solution, just not with an XSLT stylesheet but all in Java:
ByteArrayOutputStream out = new ByteArrayOutputStream();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Source source = new DOMSource(element);
Result target = new StreamResult(out);
transformer.transform(source, target);
return out.toString();
I have the following code
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
doc_.getDocumentElement().normalize();
Then I can do
doc_.getDocumentElement();
and get my first element but the problem is instead of being job the element is tns:job.
I know about and have tried to use:
dbFactory_.setNamespaceAware(true);
but that is just not what I'm looking for, I need something to completely get rid of namespaces.
Any help would be appreciated,
Thanks,
Josh
Use the Regex function. This will solve this issue:
public static String removeXmlStringNamespaceAndPreamble(String xmlString) {
return xmlString.replaceAll("(<\\?[^<]*\\?>)?", ""). /* remove preamble */
replaceAll("xmlns.*?(\"|\').*?(\"|\')", "") /* remove xmlns declaration */
.replaceAll("(<)(\\w+:)(.*?>)", "$1$3") /* remove opening tag prefix */
.replaceAll("(</)(\\w+:)(.*?>)", "$1$3"); /* remove closing tags prefix */
}
For Element and Attribute nodes:
Node node = ...;
String name = node.getLocalName();
will give you the local part of the node's name.
See Node.getLocalName()
You can pre-process XML to remove all namespaces, if you absolutely must do so. I'd recommend against it, as removing namespaces from an XML document is in essence comparable to removing namespaces from a programming framework or library - you risk name clashes and lose the ability to differentiate between once-distinct elements. However, it's your funeral. ;-)
This XSLT transformation removes all namespaces from any XML document.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()|#*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node()|#*" />
</xsl:element>
</xsl:template>
<xsl:template match="#*">
<xsl:attribute name="{local-name()}">
<xsl:apply-templates select="node()|#*" />
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Apply it to your XML document. Java examples for doing such a thing should be plenty, even on this site. The resulting document will be exactly of the same structure and layout, just without namespaces.
Rather than
dbFactory_.setNamespaceAware(true);
Use
dbFactory_.setNamespaceAware(false);
Although I agree with Tomalak: in general, namespaces are more helpful than harmful. Why don't you want to use them?
Edit: this answer doesn't answer the OP's question, which was how to get rid of namespace prefixes. RD01 provided the correct answer to that.
Tomalak, one fix of your XSLT (in 3rd template):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node() | #*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node() | #*" />
</xsl:element>
</xsl:template>
<xsl:template match="#*">
<!-- Here! -->
<xsl:copy>
<xsl:apply-templates select="node() | #*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
public static void wipeRootNamespaces(Document xml) {
Node root = xml.getDocumentElement();
NodeList rootchildren = root.getChildNodes();
Element newroot = xml.createElement(root.getNodeName());
for (int i=0;i<rootchildren.getLength();i++) {
newroot.appendChild(rootchildren.item(i).cloneNode(true));
}
xml.replaceChild(newroot, root);
}
The size of the input xml also needs to be considered when choosing the solution. For large xmls, in the size of ~100k, possible if your input is from a web service, you also need to consider the garbage collection implications when you manipulate a large string. We used String.replaceAll before, and it caused frequent OOM in production with a 1.5G heap size because of the way replaceAll is implemented.
You can reference http://app-inf.blogspot.com/2013/04/pitfalls-of-handling-large-string.html for our findings.
I am not sure how XSLT deals with large String objects, but we ended up parsing the string manualy to remove prefixes in one parse to avoid creating additional large java objects.
public static String removePrefixes(String input1) {
String ret = null;
int strStart = 0;
boolean finished = false;
if (input1 != null) {
//BE CAREFUL : allocate enough size for StringBuffer to avoid expansion
StringBuffer sb = new StringBuffer(input1.length());
while (!finished) {
int start = input1.indexOf('<', strStart);
int end = input1.indexOf('>', strStart);
if (start != -1 && end != -1) {
// Appending anything before '<', including '<'
sb.append(input1, strStart, start + 1);
String tag = input1.substring(start + 1, end);
if (tag.charAt(0) == '/') {
// Appending '/' if it is "</"
sb.append('/');
tag = tag.substring(1);
}
int colon = tag.indexOf(':');
int space = tag.indexOf(' ');
if (colon != -1 && (space == -1 || colon < space)) {
tag = tag.substring(colon + 1);
}
// Appending tag with prefix removed, and ">"
sb.append(tag).append('>');
strStart = end + 1;
} else {
finished = true;
}
}
//BE CAREFUL : use new String(sb) instead of sb.toString for large Strings
ret = new String(sb);
}
return ret;
}
Instead of using TransformerFactory and then calling transform on it (which was injecting the empty namespace, I transformed as follows:
OutputStream outputStream = new FileOutputStream(new File(xMLFilePath));
OutputFormat outputFormat = new OutputFormat(doc, "UTF-8", true);
outputFormat.setOmitComments(true);
outputFormat.setLineWidth(0);
XMLSerializer serializer = new XMLSerializer(outputStream, outputFormat);
serializer.serialize(doc);
outputStream.close();
I also faced the namespace issue and was unable to read XML file in java. below is the solution:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);// this is imp code that will deactivate namespace in xml
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("XML/"+ fileName);