Java, XPath Expression to read all node names, node values, and attributes


I need help writing an XPath expression to read all node names, node values, and attributes in an XML string. I wrote this:
private List<String> listOne = new ArrayList<String>();
private List<String> listTwo = new ArrayList<String>();
public void read(String xml) {
try {
// Turn String into a Document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
// Setup XPath to retrieve all tags and values
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(node.getNodeName());
listTwo.add(node.getNodeValue());
// Another list to hold attributes
}
} catch(Exception e) {
LogHandle.info(e.getMessage());
}
}
I found the expression //text()[normalize-space()=''] online; however, it doesn't work. When I try to get the node name from listOne, it is just #text. I tried //, but that doesn't work either. If I had this XML:
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
listOne[0] should hold Data, listOne[1] should hold Test, listTwo[1] should hold blah, etc... All the attributes will be saved in another parallel list.
What expression should xPath evaluate?
Note: The XML String can have different tags, so I can't hard code anything.
Update: Tried this loop:
NodeList nodeList = (NodeList) xPath.evaluate("//*", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(i, node.getNodeName());
// If null then must be text node
if(node.getChildNodes() == null)
listTwo.add(i, node.getTextContent());
}
However, this only gets the root element Data, then just stops.

//* will select all element nodes, //@* all attribute nodes. However, an element node does not have a meaningful node value in the DOM (getNodeValue() returns null for elements), so you would need to read getTextContent() instead of getNodeValue().
As you seem to consider an element with child elements to have a "null" value, I think you need to check whether there are any child elements:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse("sampleInput1.xml");
XPathFactory fact = XPathFactory.newInstance();
XPath xpath = fact.newXPath();
NodeList allElements = (NodeList)xpath.evaluate("//*", doc, XPathConstants.NODESET);
ArrayList<String> elementNames = new ArrayList<>();
ArrayList<String> elementValues = new ArrayList<>();
for (int i = 0; i < allElements.getLength(); i++)
{
Node currentElement = allElements.item(i);
elementNames.add(i, currentElement.getLocalName());
elementValues.add(i, xpath.evaluate("*", currentElement, XPathConstants.NODE) != null ? null : currentElement.getTextContent());
}
for (int i = 0; i < elementNames.size(); i++)
{
System.out.println("Name: " + elementNames.get(i) + "; value: " + (elementValues.get(i)));
}
For the sample input
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
the output is
Name: Data; value: null
Name: Test; value: blah
Name: Foo; value: bar
Name: Date; value: 12242016
Name: Phone; value: null
Name: Home; value: 5555555555
Name: Mobile; value: 5555556789
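The question also asked for the attributes in a parallel list, which the code above does not collect. A minimal sketch of one way to do that, assuming the XML arrives as a String. The class name ElementCollector and its fields are hypothetical, and namespace declarations (xmlns, xmlns:prefix) are skipped on the assumption that they are not wanted as ordinary attributes:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the answer above.
class ElementCollector {
    final List<String> names = new ArrayList<>();
    final List<String> values = new ArrayList<>();                  // null when the element has child elements
    final List<Map<String, String>> attributes = new ArrayList<>(); // one map per element, possibly empty

    static ElementCollector read(String xml) {
        ElementCollector result = new ElementCollector();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so that getLocalName() returns the element name
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList all = (NodeList) xpath.evaluate("//*", doc, XPathConstants.NODESET);
            for (int i = 0; i < all.getLength(); i++) {
                Node element = all.item(i);
                result.names.add(element.getLocalName());
                result.values.add(hasChildElement(element) ? null : element.getTextContent());
                result.attributes.add(attributesOf(element));
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return result;
    }

    // True if at least one child of the node is an element (text and comments are ignored).
    private static boolean hasChildElement(Node node) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    // Copies the attributes of an element into a map, leaving out namespace declarations.
    private static Map<String, String> attributesOf(Node element) {
        Map<String, String> result = new LinkedHashMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int k = 0; k < attrs.getLength(); k++) {
            Node attr = attrs.item(k);
            String name = attr.getNodeName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue;
            }
            result.put(name, attr.getNodeValue());
        }
        return result;
    }
}
```

With the sample input, ElementCollector.read(xml).attributes.get(3) would be the map for Date and contain the single entry id=2. Note that the DOM does not guarantee the order of attributes within one element.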

Related

Parsing an XML document to get node values

I have an xml structure as below:
String attributesXML="<entry>
<value>
<List>
<String>Rob</String>
<String>Mark</String>
<String>Peter</String>
<String>John</String>
</List>
</value>
</entry>"
I want to fetch the values Rob, Mark, Peter, John. I can get the nodes starting from the entry node (code below). The problem is that I don't know what the child node names under the entry node will be, so starting from the entry node I need to keep drilling down until I find the values. I have written a method getChildNodeValue(), but it doesn't give me the required output: it prints what I need, but it prints some extra stuff as well. I need to return the values as CSV from getChildNodeValue().
Getting Entry Node:
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(attributesXML));
Document doc = db.parse(is);
NodeList nodes = doc.getElementsByTagName("entry");
for (int i = 0; i < nodes.getLength(); i++) {
if(nodes.item(i).hasChildNodes()){
getChildNodeValue(nodes.item(i));
}
}
public static void getChildNodeValue(Node node) {
System.out.println("Start Node: "+node.getNodeName());
NodeList nodeList = node.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
while(currentNode.hasChildNodes()){
System.out.println("Current Node: "+currentNode.getNodeName());
nodeList = currentNode.getChildNodes();
for(int j=0;j<nodeList.getLength();j++){
currentNode = nodeList.item(j);
System.out.println("Node name: "+currentNode.getNodeName());
System.out.println("Node value: "+currentNode.getTextContent());
}
}
}
}

You can simply use the XStream library for XML parsing; it converts Java objects to XML and vice versa.
Check out the link below:
http://x-stream.github.io/tutorial.html
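If adding a library is not an option, the plain DOM API can also do this with a short recursion. A minimal sketch, assuming that "the values" means the text of every element that has no child elements. The class name LeafValues and the method toCsv are hypothetical, and no CSV quoting is applied, so values that themselves contain commas would need extra handling:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the text of all leaf elements, in document order.
class LeafValues {

    static String toCsv(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> values = new ArrayList<>();
            collect(doc.getDocumentElement(), values);
            return String.join(",", values);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
    }

    // Recurses into child elements; an element without child elements is a leaf.
    private static void collect(Node node, List<String> out) {
        boolean hasChildElement = false;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Whitespace-only text nodes between tags are skipped by this check.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect(child, out);
            }
        }
        if (!hasChildElement) {
            out.add(node.getTextContent().trim());
        }
    }
}
```

The "extra stuff" printed by the original getChildNodeValue() comes from the whitespace text nodes between the tags and from reassigning nodeList and currentNode inside the loops; recursing only into element nodes avoids both.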

Java XML XPath Full XML

I've got a small problem. I have the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("result1.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//element");
String elements = (String) expr.evaluate(doc, XPathConstants.STRING);
What I get:
jcruz0@exblog.jp
Cheryl
Blake
195115
What i want:
<person>
<email>jcruz0@exblog.jp</email>
<firstname>Cheryl</firstname>
<lastname>Blake</lastname>
<number>195115</number>
</person>
So as you can see, I want the full XML tree, not just the node value.
Maybe somebody knows the trick.
Thanks for any help.

You got the string value of the selected XML element because you specified XPathConstants.STRING to XPathExpression.evaluate().
Instead, specify a return type of XPathConstants.NODE if you know for sure that your XPath will select a single element,
Node element = (Node) expr.evaluate(doc, XPathConstants.NODE);
or XPathConstants.NODESET for multiple elements, which you would then iterate over to process as necessary.
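A Node is still not a string, though. One way to turn the selected node back into markup is an identity Transformer from javax.xml.transform. A minimal sketch, assuming the XML is available as a String and the expression selects at most one node of interest. The class name NodeToXml and the method selectAsXml are hypothetical:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath expression and serializes the first matching node.
class NodeToXml {

    static String selectAsXml(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            if (node == null) {
                return null; // nothing matched
            }
            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not select or serialize node", e);
        }
    }
}
```

For a NODESET result, the same Transformer call can be applied to each item of the NodeList in turn.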

Something like this can be done.
XPathExpression expr = xpath.compile("/person");
NodeList elements = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < elements.getLength(); i++) {
// the person node
System.out.println(elements.item(i).getNodeName());
for (int x = 0; x < elements.item(i).getChildNodes().getLength(); x++) {
// the elements under person
if (elements.item(i).getChildNodes().item(x).getNodeType() == Node.ELEMENT_NODE) {
System.out.println("\t" + elements.item(i).getChildNodes().item(x).getNodeName() + " - " + elements.item(i).getChildNodes().item(x).getTextContent());
}
}
}
Output
person
email - jcruz0@exblog.jp
firstname - Cheryl
lastname - Blake
number - 195115
You can use the nodes to do what you want, or wrap them in < and > if you just want to print them.

Dom parsing xml problems

I have a simple .xml file and need to parse it. The file is the following:
<table name="agents">
<row name="agent" password="pass" login="agent" ext_uid="133"/>
</table>
I need to get values of name, password, login, ext_uid to create a DB record.
What I have done for this:
created an org.w3c.dom.Document:
public Document getDocument(String fileName){
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setValidating(false);
DocumentBuilder builder = f.newDocumentBuilder();
return builder.parse(new File(fileName));
}
next I'm trying to print values:
document = getDocument(fileName);
NodeList nodes = document.getChildNodes();
for (int i=0; i<nodes.getLength(); i++){
Node node = nodes.item(i);
if(node.getNodeType() == Node.ELEMENT_NODE){
NodeList listofNodes = node.getChildNodes();
for(int j=0; j<listofNodes.getLength(); j++){
if(node.getNodeType() == Node.ELEMENT_NODE){
Node childNode = listofNodes.item(j);
System.out.println(childNode.getNodeValue()+" " + childNode.getNodeName());
}
}
}
}
I use this because I'm trying to find out how to get values: childNode.getNodeValue()+" " + childNode.getNodeName()
but the result is the following:
#text
null row
#text
In the first and the third cases the node value is empty, and in the second case it is null, which I guess means that there is no node value at all.
So my question is how to get values of name, password, login, ext_uid?

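childNode.getNodeValue() is null because <row> is an element node, and getNodeValue() returns null for every element. You have to read its attributes instead:
Node childNode = listofNodes.item(j);
// Only element nodes can be cast to Element; the surrounding #text nodes cannot
if (childNode.getNodeType() == Node.ELEMENT_NODE) {
    Element e = (Element) childNode;
    String name = e.getAttribute("name");
    String password = e.getAttribute("password");
    String login = e.getAttribute("login");
    String ext_uid = e.getAttribute("ext_uid");
}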

The <row> element has no text content, it only has attributes. If it had content it would look more like <row>this would be the text returned from getTextContent()</row>.
One way to get the data is to iterate the XML node attributes, for example:
NamedNodeMap attrs = childNode.getAttributes();
if (attrs != null) {
for (int k = 0; k < attrs.getLength(); k++) {
System.out.println("Attribute: "
+ attrs.item(k).getNodeName() + " = "
+ attrs.item(k).getNodeValue());
}
}
The output of your code shows #text because of the line breaks and indentation between the tags in the example XML file, which, according to the specification, are preserved as text nodes. The null in the example output is the node value of the <row> element, which is null for every element node.

Use XPath instead:
XPath xp = XPathFactory.newInstance().newXPath();
System.out.println(xp.evaluate("/table/row/@name", doc));
System.out.println(xp.evaluate("/table/row/@password", doc));
System.out.println(xp.evaluate("/table/row/@login", doc));
System.out.println(xp.evaluate("/table/row/@ext_uid", doc));
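If the attribute names are not known in advance, or the table has several rows, @* selects every attribute of the context node. A minimal sketch under those assumptions; the class name RowAttributes and the method read are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one map of attribute name to value per <row> element.
class RowAttributes {

    static List<Map<String, String>> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList rows = (NodeList) xp.evaluate("/table/row", doc, XPathConstants.NODESET);
            List<Map<String, String>> result = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                // "@*" is evaluated relative to the current row, not the whole document.
                NodeList attrs = (NodeList) xp.evaluate("@*", rows.item(i), XPathConstants.NODESET);
                Map<String, String> row = new LinkedHashMap<>();
                for (int k = 0; k < attrs.getLength(); k++) {
                    Node attr = attrs.item(k);
                    row.put(attr.getNodeName(), attr.getNodeValue());
                }
                result.add(row);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
    }
}
```

Each map can then be used to build one DB record. The order of attributes inside a map should not be relied on, since XML does not define an attribute order.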

Parse xml without tagname

I have an XML file:
<Response>
<StatusCode>0</StatusCode>
<StatusDetail>OK</StatusDetail>
<AccountInfo>
<element1>value</element1>
<element2>value</element2>
<element3>value</element3>
<elementN>value</elementN>
</AccountInfo>
</Response>
I want to parse the elements in AccountInfo, but I don't know their tag names.
For now I'm using this code for tests, but in the future I will receive more elements in AccountInfo, and I don't know how many there will be or what their names are.
String name="";
String balance="";
Node accountInfo = document.getElementsByTagName("AccountInfo").item(0);
if (accountInfo.getNodeType() == Node.ELEMENT_NODE){
Element accountInfoElement = (Element) accountInfo;
name = accountInfoElement.getElementsByTagName("Name").item(0).getTextContent();
balance = accountInfoElement.getElementsByTagName("Balance").item(0).getTextContent();
}

Here are two ways you can do it:
Node accountInfo = document.getElementsByTagName("AccountInfo").item(0);
NodeList children = accountInfo.getChildNodes();
or you can do
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList children = (NodeList) xPath.evaluate("//AccountInfo/*", document.getDocumentElement(), XPathConstants.NODESET);
Once you have your NodeList you can loop through them.
for(int i=0;i<children.getLength();i++) {
if(children.item(i).getNodeType() == Node.ELEMENT_NODE) {
Element elem = (Element)children.item(i);
// If your document is namespace aware use localName (it is null otherwise)
String localName = elem.getLocalName();
// Tag name returns the qualified name: the namespace prefix (if any) plus the local name
String tagName = elem.getTagName();
// do stuff with the children
}
}
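As a sketch of what "do stuff with the children" might look like for the question's document, the child elements can be collected into a map keyed by tag name. This assumes a parser that is not namespace aware (the default), unique tag names inside AccountInfo, and XML passed in as a String; the class name AccountInfoReader is hypothetical:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps each child element of AccountInfo to its text content.
class AccountInfoReader {

    static Map<String, String> read(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            // "*" matches element nodes only, so no node-type check is needed here.
            NodeList children = (NodeList) xPath.evaluate(
                    "//AccountInfo/*", document, XPathConstants.NODESET);
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                result.put(child.getNodeName(), child.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
    }
}
```

If tag names can repeat, a list of name/value pairs would be a better fit than a map, since later entries would otherwise overwrite earlier ones.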

Child elements of DOM

I have this XML file:
<scene>
<texture file="file1.dds"/>
<texture file="file2.dds"/>
...
<node name="cube">
<texture name="stone" unit="0" sampler="anisotropic"/>
</node>
</scene>
I need all child elements of 'scene' that are named "texture", but with this code:
Element rootNode = document.getDocumentElement();
NodeList childNodes = rootNode.getElementsByTagName("texture");
for (int nodeIx = 0; nodeIx < childNodes.getLength(); nodeIx++) {
Node node = childNodes.item(nodeIx);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// cool stuff here
}
}
I also get the 'texture' elements that are inside 'node'.
How can I filter these out? Or how can I get only the elements that are direct children of 'scene'?

You can do it using XPath. Consider the following example, taken from the JAXP Specification 1.4 (which I recommend you consult for this):
// parse the XML as a W3C Document
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document document = builder.parse(new File("/widgets.xml"));
// evaluate the XPath expression against the Document
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/widgets/widget[@name='a']/@quantity";
Double quantity = (Double) xpath.evaluate(expression, document, XPathConstants.NUMBER);
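Applied to the scene document from the question, the expression /scene/texture matches only the texture elements that are direct children of the root, because each / steps down exactly one level (unlike //, which matches descendants at any depth). A minimal sketch, assuming the XML is passed in as a String; the class name DirectChildren and the method textureFiles are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the "file" attribute of each texture directly under scene.
class DirectChildren {

    static List<String> textureFiles(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Textures nested inside <node> are not matched by this path.
            NodeList textures = (NodeList) xpath.evaluate(
                    "/scene/texture", document, XPathConstants.NODESET);
            List<String> files = new ArrayList<>();
            for (int i = 0; i < textures.getLength(); i++) {
                Element texture = (Element) textures.item(i);
                files.add(texture.getAttribute("file"));
            }
            return files;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
    }
}
```

When the parent is an arbitrary Element rather than the document root, the relative expression "texture" evaluated against that element selects its direct texture children in the same way.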

I found a solution myself that works fine:
Element parent = ... ;
String childName = "texture";
NodeList childs = parent.getChildNodes();
for (int nodeIx = 0; nodeIx < childs.getLength(); nodeIx++) {
Node node = childs.item(nodeIx);
if (node.getNodeType() == Node.ELEMENT_NODE
&& node.getNodeName().equals(childName)) {
// cool stuff here
}
}

