XPath Java count of child nodes - java

I want to count some child nodes of a given XML document, but the count always comes back as 0 and I can't figure out why.
Here's the xml:
<FirstOne xmlns:xxx="http://www.w3.org/2001/XMLSchema-instance">
<Formulas xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
<xxx:yyy>
<aa:bb>something</aa:bb>
<cc:dd>something</cc:dd>
</xxx:yyy>
<xxx:yyy>
<aa:bb>something</aa:bb>
<cc:dd>something</cc:dd>
</xxx:yyy>
<xxx:yyy>
<aa:bb>something</aa:bb>
<cc:dd>something</cc:dd>
</xxx:yyy>
</Formulas>
</FirstOne>
I want to count the number of xxx:yyy elements; in this example the answer should be 3.
I tried the following:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new FileInputStream(new File(fileArray[i].toString())));
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String expression;
expression = "count(//Formulas/xxx:yyy)";
Double result = (Double) xpath.evaluate(expression, doc, XPathConstants.NUMBER);
It always gives me 0.0 ...
Thanks for your help!

The problems all stem from the namespaces.
Firstly, XPath evaluation is only defined over namespace-well-formed XML, so you need to ensure that the aa and cc prefixes are properly mapped to namespace URIs in the XML.
Secondly, you need to parse the XML into a DOM tree using a namespace-aware parser (for what I can only assume are historical reasons, DocumentBuilderFactory is not namespace-aware by default).
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new FileInputStream(new File(fileArray[i].toString())));
Now that you have a proper namespace-well-formed DOM tree, you need to handle the namespaces correctly in the XPath. You need to define a NamespaceContext telling the XPath how to relate prefixes and namespace URIs. Annoyingly, there's no default implementation of this interface in the core Java libraries, but there are third-party implementations such as Spring's SimpleNamespaceContext, or you can implement it yourself, since the interface has only three methods. With a SimpleNamespaceContext:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
xpath.setNamespaceContext(nsCtx);
nsCtx.bindNamespaceUri("x", "http://www.w3.org/2001/XMLSchema-instance");
With this context in place you can now select namespaced nodes in your XPath expression:
String expression = "count(//Formulas/x:yyy)";
(the prefixes you use are the ones in the NamespaceContext, not necessarily the ones in the original XML source).
While some DOM parsers and XPath implementations might let you get away with parsing non-namespace-aware and omitting the prefixes in the XPath expressions, this is an implementation detail and the behaviour is not defined by the specifications. It might work in one version but fail in another, or behave differently if you add additional JARs to your project that change the default parser, etc.
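If you would rather not pull in Spring, the "three methods" route can be sketched roughly as below. The class names MapNamespaceContext and YyyCounter are made up for illustration, and the sample XML is a trimmed version of the question's document (the aa/cc children are left out because their prefixes are not declared there).

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class YyyCounter {
    /** Counts Formulas/yyy elements in the XMLSchema-instance namespace. */
    static double countYyy(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // off by default
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // "x" is our own prefix; only the URI has to match the document.
            xpath.setNamespaceContext(new MapNamespaceContext(Collections.singletonMap(
                    "x", "http://www.w3.org/2001/XMLSchema-instance")));
            return (Double) xpath.evaluate(
                    "count(//Formulas/x:yyy)", doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FirstOne xmlns:xxx=\"http://www.w3.org/2001/XMLSchema-instance\">"
                + "<Formulas><xxx:yyy/><xxx:yyy/><xxx:yyy/></Formulas></FirstOne>";
        System.out.println(countYyy(xml));
    }
}

/** Hypothetical helper: a NamespaceContext backed by a fixed prefix-to-URI map. */
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    MapNamespaceContext(Map<String, String> prefixToUri) {
        this.prefixToUri = prefixToUri;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                return entry.getKey();
            }
        }
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
    }
}
```

This sketch skips the special handling of the reserved xml and xmlns prefixes that the NamespaceContext Javadoc describes, which is usually fine for simple XPath lookups but worth adding for general-purpose use.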

Since xxx is only the tag's prefix, just leave it out of the expression: count(//Formulas/yyy).

Related

How to avoid xmlns="" to be added to a manipulated XML root element?

I'm changing the jta-data-source value of a persistence.xml as follows:
JavaArchive jarArchive = Maven.configureResolver().workOffline().resolve("richtercloud:project1-jar:jar:1.0-SNAPSHOT").withoutTransitivity().asSingle(JavaArchive.class);
Node persistenceXml = jarArchive.get("META-INF/persistence.xml");
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document persistenceXmlDocument = documentBuilder.parse(persistenceXml.getAsset().openStream());
//asked
//https://stackoverflow.com/questions/46771622/how-to-create-a-shrinkwrap-persistencedescriptor-from-an-existing-persistence-xm
//for how to manipulate persistence.xml more easily with
//ShrinkWrap's PersistenceDescriptor
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//persistence-unit/jta-data-source");
org.w3c.dom.Node persistenceXmlDataSourceNode = (org.w3c.dom.Node) expr.evaluate(persistenceXmlDocument,
XPathConstants.NODE);
persistenceXmlDataSourceNode.setTextContent("jdbc/project1-test-db");
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
//transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
//was there before, but unclear why
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(persistenceXmlDocument), new StreamResult(writer));
String persistenceUnit = writer.toString();
(since How to create a ShrinkWrap PersistenceDescriptor from an existing persistence.xml? has not been answered, yet).
That works fine, except for an xmlns="" attribute added to the persistence-unit element under the root persistence element, which seems to cause:
java.io.IOException: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 108; Deployment descriptor file META-INF/persistence.xml in archive [project1-jar-1.0-SNAPSHOT.jar]. cvc-complex-type.2.4.a: Invalid content was found starting with element 'persistence-unit'. One of '{"http://xmlns.jcp.org/xml/ns/persistence":persistence-unit}' is expected.
I'm not attached to the idea of using Transformer and related classes.
I have no idea why I can't reproduce this in Java SE, but the problem is that javax.xml.parsers.DocumentBuilder isn't namespace-aware by default, so the namespace information gets lost while the document is manipulated and the Transformer consequently adds an empty xmlns attribute. The fix is to call setNamespaceAware(true) on the DocumentBuilderFactory.
Once the DocumentBuilder is namespace-aware, however, the original XPath query no longer matches and returns null; see XPath returning null for "Node" when isNameSpaceAware and isValidating are "true" for more details on the XPath namespace awareness issue (I couldn't get the suggested solution of building a custom NamespaceContext to work).
To work around this, I finally changed my XPath query to use the local-name function as described at How to ignore namespace when selecting XML nodes with XPath, i.e. //*[local-name()='jta-data-source'].
It's still necessary to use Document.createElementNS instead of createElement in order to avoid an empty xmlns attribute on newly created elements; see Empty default XML namespace xmlns="" attribute being added? for an explanation.
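Put together, the approach looks roughly like the sketch below. It works on a plain string instead of a ShrinkWrap asset, and the class name DataSourceRewriter and the minimal persistence.xml in main are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DataSourceRewriter {
    /** Returns the given persistence.xml with the jta-data-source text replaced. */
    static String replaceJtaDataSource(String persistenceXml, String newValue) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // keep the default namespace on every element
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(persistenceXml)));

            // local-name() matches the element regardless of its namespace,
            // so no NamespaceContext is needed.
            Node dataSource = (Node) XPathFactory.newInstance().newXPath().evaluate(
                    "//*[local-name()='jta-data-source']", doc, XPathConstants.NODE);
            if (dataSource == null) {
                throw new IllegalArgumentException("no jta-data-source element found");
            }
            dataSource.setTextContent(newValue);

            StringWriter writer = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("could not rewrite persistence.xml", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<persistence xmlns=\"http://xmlns.jcp.org/xml/ns/persistence\" version=\"2.1\">"
                + "<persistence-unit name=\"pu\"><jta-data-source>jdbc/old</jta-data-source>"
                + "</persistence-unit></persistence>";
        System.out.println(replaceJtaDataSource(xml, "jdbc/project1-test-db"));
    }
}
```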

Finding Element in NodeList XML

Is there a way I can get the first Element from a NodeList? I'm using org.w3c.dom to handle XML files. I have already written large parts of my program using org.w3c.dom and only recently discovered dom4j, which has a method for this, but I cannot use it because of backward compatibility issues with my other methods.
It is critical that I can find and pass the very first Element in my XML, and it must be of type org.w3c.dom.Element. However, not even doc.normalize() has helped, and neither did using dom4j methods to find the element and casting it to org.w3c.dom.Element, as that's forbidden.
File file = new File("myXML.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(file);
doc.normalize();
Element sourceElem = doc.getDocumentElement();
NodeList nodeList = sourceElem.getChildNodes();
Element elem = null;
// elem is only assigned if the very first child node happens to be an element
if (nodeList.item(0).getNodeType() == Node.ELEMENT_NODE) {
    elem = (Element) nodeList.item(0);
}
I'm getting a NullPointerException from other methods because it can't find the element.
I need it to work the same way this C# code does:
XmlDocument doc = new XmlDocument();
doc.Load("myXML.xml");
XmlElement elem = (XmlElement)doc.DocumentElement.ChildNodes[0];
EDIT: Or is there a way I can cast a dom4j Element back into an org.w3c.dom.Element?
EDIT2: Sample XML I need to access: http://pastebin.com/C3nvxhwx
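One likely cause, assuming the XML file is indented: getChildNodes() also returns the whitespace between tags as text nodes, so item(0) is a text node rather than an element and elem is never assigned. A minimal sketch that walks the children until it reaches the first element (the class and method names are made up for illustration, and the XML in main is a stand-in, not the pastebin sample):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class FirstChildElement {
    /** Returns the first child of parent that is an Element, or null if there is none. */
    static Element firstChildElement(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n; // skips whitespace text nodes and comments
            }
        }
        return null;
    }

    /** Small parsing helper so the example is self-contained. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\n  <first/>\n  <second/>\n</root>");
        Element first = firstChildElement(doc.getDocumentElement());
        System.out.println(first.getTagName());
    }
}
```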

java DOM lookupNamespaceURI is not able to locate namespace URI

I'm trying to follow the UniversalNamespaceResolver example from http://www.ibm.com/developerworks/xml/library/x-nmspccontext/index.html for resolving namespaces when evaluating XPath against an XML document. The problem I encountered is that the lookupNamespaceURI call below returns null for the XML given below:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse(new InputSource(new StringReader(xml)));
String nsURI = dDoc.lookupNamespaceURI("h");
the XML:
<?xml version="1.0"?>
<h:root xmlns:h="http://www.w3.org/TR/html4/">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
</h:root>
while I'd expect it to return "http://www.w3.org/TR/html4/".
When configuring a DocumentBuilder, you have to explicitly make it namespace aware (a silly relic from the first days of xml when there were no namespaces):
domFactory.setNamespaceAware(true);
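For completeness, here is the question's snippet with that one line added, as a minimal sketch (the class name LookupNamespaceDemo is made up, and the XML is a shortened version of the one in the question):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class LookupNamespaceDemo {
    /** Parses xml with a namespace-aware builder and resolves the given prefix. */
    static String lookupNamespace(String xml, String prefix) {
        try {
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setNamespaceAware(true); // the missing line
            Document dDoc = domFactory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return dDoc.lookupNamespaceURI(prefix);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
                + "<h:root xmlns:h=\"http://www.w3.org/TR/html4/\">"
                + "<h:table><h:tr><h:td>Apples</h:td></h:tr></h:table></h:root>";
        System.out.println(lookupNamespace(xml, "h"));
    }
}
```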
As a side note, the advice in that article is not very good. It fundamentally misses the point that you don't care what the namespace prefixes are in the actual document; they are irrelevant. You need the XPath namespace resolver to match the XPath expressions that you are using, and that is all. If you do what they are suggesting, you will have to change your XPath code whenever the document's prefixes change, which is a horrible idea.
Note that they sort of concede this point in their last bullet, but the rest of the article seems to miss that this is the fundamental idea when using XPath.
But if you don't have control over the XML file, and someone can send you any prefixes they wish, it might be better to be independent of their choices. You can code your own namespace resolution as in Example 1 (HardcodedNamespaceResolver), and use them in your XPath expressions.
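To illustrate the point about prefixes, here is a rough sketch in the spirit of a hardcoded resolver. The class name PrefixIndependence and the prefix "html" are my own choices, not taken from the article. The same XPath expression is evaluated against two documents that bind the same namespace URI to different prefixes.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class PrefixIndependence {
    private static final String HTML_NS = "http://www.w3.org/TR/html4/";

    /** Counts td elements in the HTML_NS namespace, whatever prefix the document uses. */
    static double countCells(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // The resolver only knows the prefix used in OUR expression.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "html".equals(prefix) ? HTML_NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return HTML_NS.equals(uri) ? "html" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    String prefix = getPrefix(uri);
                    return prefix == null
                            ? Collections.<String>emptyIterator()
                            : Collections.singleton(prefix).iterator();
                }
            });
            return (Double) xpath.evaluate("count(//html:td)", doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String withH = "<h:root xmlns:h=\"" + HTML_NS + "\"><h:td>Apples</h:td><h:td>Bananas</h:td></h:root>";
        String withZ = "<z:root xmlns:z=\"" + HTML_NS + "\"><z:td>Apples</z:td><z:td>Bananas</z:td></z:root>";
        // Both documents are queried with the identical expression.
        System.out.println(countCells(withH));
        System.out.println(countCells(withZ));
    }
}
```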

How to append xml nodes (as a string) into an existing XML Element node (only using java builtins)?

(Disclaimer: using Rhino inside RingoJS)
Let's say I have a document with an element <test>. I don't see how I can append nodes given as a string to this element. To parse the string into XML nodes and then append them to the element, I tried to use a DocumentFragment, but I couldn't get anywhere. In short, I need something as easy as .NET's InnerXml, but there's no equivalent in the Java API.
var dbFactory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
var dBuilder = dbFactory.newDocumentBuilder();
var doc = dBuilder.newDocument();
var el = doc.createElement('test');
var nodesToAppend = '<foo bar="1">Hi <baz>there</baz></foo>';
el.appendChild(???);
How can I do this without using any third party library ?
[EDIT] It's not obvious in the example but I'm not supposed to know the content of variable 'nodesToAppend'. So please, don't point me to tutorials about how to create elements in an xml document.
You can do this in Java, and you should be able to derive the Rhino equivalent:
DocumentBuilderFactory dbFactory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.newDocument();
Element el = doc.createElement("test");
doc.appendChild(el);
// Parse the string into a second, temporary document
String xml = "<foo bar=\"1\">Hi <baz>there</baz></foo>";
Document doc2 = dBuilder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
// Deep-copy the parsed root element into the target document, then attach it
Node node = doc.importNode(doc2.getDocumentElement(), true);
el.appendChild(node);
Since doc and doc2 are two different Documents, the trick is to import the node from one document into the other, which is done with the importNode API above.
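If the string may contain several top-level nodes or leading text (closer to what InnerXml accepts), a single parse will fail because a document needs exactly one root. One possible workaround, sketched here under the assumption that the string is otherwise well-formed, is to wrap it in a throwaway element and import that element's children. The names InnerXml, appendXml, newElement and wrapper are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class InnerXml {
    /** Parses xmlFragment and appends the resulting nodes to target. */
    static void appendXml(Element target, String xmlFragment) {
        try {
            // Temporary root so that multiple top-level nodes still parse.
            String wrapped = "<wrapper>" + xmlFragment + "</wrapper>";
            Document parsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));

            Document owner = target.getOwnerDocument();
            DocumentFragment fragment = owner.createDocumentFragment();
            NodeList children = parsed.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                // importNode copies the node, so the source list is not modified.
                fragment.appendChild(owner.importNode(children.item(i), true));
            }
            target.appendChild(fragment); // moves the fragment's children into target
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    /** Creates a new document whose root element has the given name. */
    static Element newElement(String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element el = doc.createElement(name);
            doc.appendChild(el);
            return el;
        } catch (Exception e) {
            throw new IllegalStateException("could not create document", e);
        }
    }

    public static void main(String[] args) {
        Element el = newElement("test");
        appendXml(el, "<foo bar=\"1\">Hi <baz>there</baz></foo><qux/>");
        System.out.println(el.getChildNodes().getLength());
    }
}
```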
I think your question is similar to this one, which already has an answer:
Java: How to read and write xml files?
Alternatively, see http://www.mkyong.com/java/how-to-create-xml-file-in-java-dom/

Java XML parser?

I'm currently converting a program I wrote in Visual Basic .NET (the 2005 variety) into Java. It used built-in XML methods to parse and generate the user's saved data, does Java have an equivalent feature built in or am I going to have to change file processing implementations? (I'd rather not, there's a lot of code I'd have to change.)
Yes, Java can parse XML. Here's an example that takes in a String that contains XML and builds a Document object out of it:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inputSource = new InputSource(new StringReader(xml));
Document document = documentBuilder.parse(inputSource);
You can then use the XPath API to query the dom. Here's a tutorial/writeup about it.
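A minimal sketch of such a query, with a made-up class name (XPathQueryDemo) and sample XML:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class XPathQueryDemo {
    /** Parses xml and returns the string value of the XPath expression. */
    static String evaluate(String xml, String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The two-argument overload returns the result as a String.
            return xpath.evaluate(expression, document);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><child attr=\"value\">I am a child node</child></root>";
        System.out.println(evaluate(xml, "/root/child"));
        System.out.println(evaluate(xml, "/root/child/@attr"));
    }
}
```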
As far as serializing objects to XML, the official implementation is JAXB, which was bundled with the JDK from Java 1.6 onwards (since Java 11 it has to be added as a separate dependency). Here's a simple example. It will let you serialize and deserialize to and from XML.
You can also create a DOM object manually and add nodes to it, but it's a little more tedious:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootNode = document.createElement("root");
Element childNode = document.createElement("child");
childNode.setTextContent("I am a child node");
childNode.setAttribute("attr", "value");
rootNode.appendChild(childNode);
document.appendChild(rootNode);
I'm assuming you mean that the properties/structure were generated from the classes/beans themselves? If so, then the answer is no (not without a third-party component). I've used XStream before, and that is about the closest I've gotten to .NET's XML class serialization.
