Parsing an XML document to get node values

I have an XML structure like this:
String attributesXML = "<entry>"
        + "<value>"
        + "<List>"
        + "<String>Rob</String>"
        + "<String>Mark</String>"
        + "<String>Peter</String>"
        + "<String>John</String>"
        + "</List>"
        + "</value>"
        + "</entry>";
I want to fetch the values Rob, Mark, Peter, and John. I can get the nodes starting from the entry node (code below). The problem is that I don't know what the child node names under the entry node will be, so starting from the entry node I need to keep drilling down until I find the values. I have written a method getChildNodeValue(), but it doesn't give me the required output: it prints what I need, but it prints some extra output as well. I need this method to return the values as a CSV string.
Getting Entry Node:
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(attributesXML));
Document doc = db.parse(is);
NodeList nodes = doc.getElementsByTagName("entry");
for (int i = 0; i < nodes.getLength(); i++) {
if(nodes.item(i).hasChildNodes()){
getChildNodeValue(nodes.item(i));
}
}
public static void getChildNodeValue(Node node) {
System.out.println("Start Node: "+node.getNodeName());
NodeList nodeList = node.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
while(currentNode.hasChildNodes()){
System.out.println("Current Node: "+currentNode.getNodeName());
nodeList = currentNode.getChildNodes();
for(int j=0;j<nodeList.getLength();j++){
currentNode = nodeList.item(j);
System.out.println("Node name: "+currentNode.getNodeName());
System.out.println("Node value: "+currentNode.getTextContent());
}
}
}
}

You can use the XStream library for XML parsing; it converts Java objects to XML and back.
See the tutorial at the link below:
http://x-stream.github.io/tutorial.html
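
If adding a library is not an option, the drill-down described in the question can also be done with the JDK's own DOM API. The following is a minimal sketch of that alternative, not the XStream approach: it recurses into child elements and treats any element without child elements as a leaf whose text is collected. The class and method names (LeafValueCollector, leafValuesAsCsv, collect) are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafValueCollector {

    // Parses the XML string and returns the text of every leaf element, comma-separated.
    static String leafValuesAsCsv(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            List<String> values = new ArrayList<>();
            collect(doc.getDocumentElement(), values);
            return String.join(",", values);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Recurses into child elements; an element with no child elements is treated as a leaf.
    private static void collect(Node node, List<String> values) {
        boolean hasChildElement = false;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect(child, values);
            }
        }
        if (!hasChildElement) {
            String text = node.getTextContent().trim();
            if (!text.isEmpty()) {
                values.add(text);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<entry><value><List><String>Rob</String><String>Mark</String>"
                + "<String>Peter</String><String>John</String></List></value></entry>";
        System.out.println(leafValuesAsCsv(xml)); // prints Rob,Mark,Peter,John
    }
}
```

Checking the node type is what avoids the "extra stuff": whitespace between tags is stored as Text nodes, and those are skipped here instead of being printed.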

Related

Java, XPath Expression to read all node names, node values, and attributes

I need help writing an XPath expression that reads all node names, node values, and attributes in an XML string. I wrote this:
private List<String> listOne = new ArrayList<String>();
private List<String> listTwo = new ArrayList<String>();
public void read(String xml) {
try {
// Turn String into a Document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
// Setup XPath to retrieve all tags and values
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(node.getNodeName());
listTwo.add(node.getNodeValue());
// Another list to hold attributes
}
} catch(Exception e) {
LogHandle.info(e.getMessage());
}
}
I found the expression //text()[normalize-space()=''] online; however, it doesn't work. When I try to get the node name from listOne, it is just #text. I tried //, but that doesn't work either. If I had this XML:
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
listOne[0] should hold Data, listOne[1] should hold Test, listTwo[1] should hold blah, etc... All the attributes will be saved in another parallel list.
What expression should xPath evaluate?
Note: The XML String can have different tags, so I can't hard code anything.
Update: Tried this loop:
NodeList nodeList = (NodeList) xPath.evaluate("//*", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(i, node.getNodeName());
// If null then must be text node
if(node.getChildNodes() == null)
listTwo.add(i, node.getTextContent());
}
However, this only gets the root element Data, then just stops.

//* will select all element nodes, //@* all attribute nodes. However, an element node does not have a meaningful node value in the DOM, so you need to read getTextContent() instead of getNodeValue().
As you seem to consider an element with child elements to have a "null" value, I think you need to check whether there are any child elements:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse("sampleInput1.xml");
XPathFactory fact = XPathFactory.newInstance();
XPath xpath = fact.newXPath();
NodeList allElements = (NodeList)xpath.evaluate("//*", doc, XPathConstants.NODESET);
ArrayList<String> elementNames = new ArrayList<>();
ArrayList<String> elementValues = new ArrayList<>();
for (int i = 0; i < allElements.getLength(); i++)
{
Node currentElement = allElements.item(i);
elementNames.add(i, currentElement.getLocalName());
elementValues.add(i, xpath.evaluate("*", currentElement, XPathConstants.NODE) != null ? null : currentElement.getTextContent());
}
for (int i = 0; i < elementNames.size(); i++)
{
System.out.println("Name: " + elementNames.get(i) + "; value: " + (elementValues.get(i)));
}
For the sample input
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
the output is
Name: Data; value: null
Name: Test; value: blah
Name: Foo; value: bar
Name: Date; value: 12242016
Name: Phone; value: null
Name: Home; value: 5555555555
Name: Mobile; value: 5555556789
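
The attributes asked about in the question can be collected the same way with the //@* expression. This is only a sketch: the class and method names (AttributeLister, listAttributes) and the "element/@name=value" output format are assumptions chosen for illustration, and the sample input omits the namespace declaration to keep the example small.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributeLister {

    // Returns one "element/@name=value" entry per attribute found in the document.
    static List<String> listAttributes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // //@* selects every attribute node in the document
            NodeList attrs = (NodeList) xpath.evaluate("//@*", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                result.add(attr.getOwnerElement().getNodeName()
                        + "/@" + attr.getName() + "=" + attr.getValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read attributes", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Data><Test>blah</Test><Date id=\"2\">12242016</Date></Data>";
        System.out.println(listAttributes(xml)); // prints [Date/@id=2]
    }
}
```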

DOM parsing in Java not able to get the nested nodes

I have to parse an xml file in which I have many name value pairs.
I have to update the value in case it matches a given name.
I opted for DOM parsing as it can easily traverse any part of the document and quickly update the value.
It is, however, giving me some weird results when I run it on my sample file.
I am new to DOM, so any help would be appreciated.
I tried various things, but they all result in either null values for the content or #text as the node name.
I am not able to get the text content of the tag.
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlFilePath);
//This will get the first NVPair
Node NVPairs = document.getElementsByTagName("NVPairs").item(0);
//This should assign nodes with all the child nodes of NVPairs. This should be ideally
//<nameValuePair>
NodeList nodes = NVPairs.getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
// I think it will consider both starting and closing tag as node so checking for if it has
//child
if(node.hasChildNodes())
{
//This should give me the content in the name tag.
//However this is not happening
if ("Tom".equals(node.getFirstChild().getTextContent())) {
node.getLastChild().setTextContent("2000000");
}
}
}
Sample xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?><application>
<NVPairs>
<nameValuePair>
<name>Tom</name>
<value>12</value>
</nameValuePair>
<nameValuePair>
<name>Sam</name>
<value>121</value>
</nameValuePair>
</NVPairs>
</application>

getChildNodes() and getFirstChild() return all kinds of nodes, not just Element nodes. In this case the first child of each <nameValuePair> is a Text node (a newline and blanks), not the <name> element, so your test will never be true.
However, in cases like this it is usually much more convenient to use XPath:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(
"//nameValuePair/value[preceding-sibling::name = 'Tom']", document,
XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
node.setTextContent("2000000");
}
I.e., select all <value> elements that have a preceding sibling <name> element with the value 'Tom'.
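
For comparison, the same update can be done without XPath by asking each <nameValuePair> element for its <name> and <value> descendants by tag name, which sidesteps the whitespace Text nodes entirely. This is a rough sketch under the assumption that each pair contains at most one <name> and one <value>; the class and method names (NameValueUpdater, parse, updateValue) are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameValueUpdater {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Sets <value> to newValue in every <nameValuePair> whose <name> matches;
    // returns the number of pairs that were changed.
    static int updateValue(Document doc, String name, String newValue) {
        int updated = 0;
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            // Looking up by tag name returns elements only, never whitespace text
            Node nameNode = pair.getElementsByTagName("name").item(0);
            Node valueNode = pair.getElementsByTagName("value").item(0);
            if (nameNode != null && valueNode != null
                    && name.equals(nameNode.getTextContent().trim())) {
                valueNode.setTextContent(newValue);
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Document doc = parse("<application><NVPairs>\n"
                + "<nameValuePair>\n<name>Tom</name>\n<value>12</value>\n</nameValuePair>\n"
                + "<nameValuePair>\n<name>Sam</name>\n<value>121</value>\n</nameValuePair>\n"
                + "</NVPairs></application>");
        System.out.println(updateValue(doc, "Tom", "2000000") + " pair(s) updated");
    }
}
```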

Android/Java XML Parsing with nodes of same name

I need some advice on how to parse XML with Java where there are multiple nodes that have the same tag. For example, if I have an XML file that looks like this:
<?xml version="1.0"?>
<TrackResponse>
<TrackInfo ID="EJ958083578US">
<TrackSummary>Your item was delivered at 8:10 am on June 1 in Wilmington DE 19801.</TrackSummary>
<TrackDetail>May 30 11:07 am NOTICE LEFT WILMINGTON DE 19801.</TrackDetail>
<TrackDetail>May 30 10:08 am ARRIVAL AT UNIT WILMINGTON DE 19850.</TrackDetail>
<TrackDetail>May 29 9:55 am ACCEPT OR PICKUP EDGEWATER NJ 07020.</TrackDetail>
</TrackInfo>
</TrackResponse>
I am able to get the "TrackSummary" but I do not know how to handle the "TrackDetail", since there is more than 1. There could be more than the 3 on that sample XML so I need a way to handle that.
So far I have this code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xmlResponse));
Document dom = builder.parse(is);
//Get the ROOT: "TrackResponse"
Element docEle = dom.getDocumentElement();
//Get the CHILD: "TrackInfo"
NodeList nl = docEle.getElementsByTagName("TrackInfo");
String summary = "";
//Make sure we found the child node okay
if (nl != null && nl.getLength() > 0)
{
//In the event that there is more then one node, loop
for (int i = 0 ; i < nl.getLength(); i++)
{
summary = getTextValue(docEle,"TrackSummary");
Log.d("SUMMARY", summary);
}
return summary;
}
How would I handle the whole 'multiple TrackDetail nodes' ordeal? I'm new to XML parsing so I am a bit unfamiliar on how to tackle things like this.

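You can try something like this:
// Collects the text of every descendant element with the given tag name
// (needs java.util.ArrayList and java.util.List).
public List<String> getValues(Element element, String tagName) {
    List<String> values = new ArrayList<>();
    NodeList n = element.getElementsByTagName(tagName);
    for (int i = 0; i < n.getLength(); i++) {
        values.add(n.item(i).getTextContent());
    }
    return values;
}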

If you are free to change your implementation, I would suggest using the implementation given here.
You can collect the TrackDetail values in a string array, and when you reach XmlPullParser.END_TAG, check for the end of the TrackInfo tag and stop.

You can refer to the code below.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(f); // f is the XML file or input stream
Element root = doc.getDocumentElement();
NodeList nodeList = doc.getElementsByTagName("TrackInfo");
for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i); // this is a TrackInfo element
    // do your stuff
}
For more information, see the question below.
How to parse same name tag in xml using dom parser java?
It may help.
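
To make the "do your stuff" step concrete, here is one possible sketch that gathers every TrackDetail under each TrackInfo into a list, however many there are. The names TrackDetailReader and readDetails are hypothetical, and the XML is passed in as a string as in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TrackDetailReader {

    // Returns the text of every <TrackDetail> found under any <TrackInfo>, in document order.
    static List<String> readDetails(String xmlResponse) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlResponse)));
            List<String> details = new ArrayList<>();
            NodeList infos = dom.getElementsByTagName("TrackInfo");
            for (int i = 0; i < infos.getLength(); i++) {
                Element info = (Element) infos.item(i);
                // The NodeList holds one entry per <TrackDetail>, so the count does not matter
                NodeList detailNodes = info.getElementsByTagName("TrackDetail");
                for (int j = 0; j < detailNodes.getLength(); j++) {
                    details.add(detailNodes.item(j).getTextContent().trim());
                }
            }
            return details;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse tracking response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<TrackResponse><TrackInfo ID=\"EJ958083578US\">"
                + "<TrackSummary>Your item was delivered.</TrackSummary>"
                + "<TrackDetail>May 30 11:07 am NOTICE LEFT WILMINGTON DE 19801.</TrackDetail>"
                + "<TrackDetail>May 30 10:08 am ARRIVAL AT UNIT WILMINGTON DE 19850.</TrackDetail>"
                + "</TrackInfo></TrackResponse>";
        for (String detail : readDetails(xml)) {
            System.out.println(detail);
        }
    }
}
```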

Parse XML node by node and check for leaf node

I used XPath expression //*[count(./*) = 0] to find the leaf nodes in an XML. But instead of using the expression, I wanted to parse the XML, node by node and check if it is a leaf node or not. How can I accomplish this? My XML is a dynamic one.

Using the following Java code you can parse the XML, walk the child elements, and use hasChildNodes() to check whether a node has children. Note that hasChildNodes() also counts text nodes.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse("file.xml");
Element docEle = dom.getDocumentElement();
NodeList nl = docEle.getChildNodes();
if (nl != null && nl.getLength() > 0) {
    for (int i = 0; i < nl.getLength(); i++) {
        if (nl.item(i).getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) nl.item(i);
            // true even for <a>text</a>, because the text is a child node
            boolean hasChildren = el.hasChildNodes();
            String text = el.getTextContent().trim();
        }
    }
}
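
Since the question asks for a node-by-node walk over a dynamic document, a recursive version may be closer to what is needed. The sketch below takes "leaf" to mean an element with no child elements, matching the XPath expression //*[count(./*) = 0] from the question; the names LeafFinder, isLeaf, and leafNames are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LeafFinder {

    // An element is a leaf when none of its children is an element (text children are allowed).
    static boolean isLeaf(Element element) {
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                return false;
            }
        }
        return true;
    }

    // Visits the element and its descendants depth-first, recording the names of leaf elements.
    private static void walk(Element element, List<String> leafNames) {
        if (isLeaf(element)) {
            leafNames.add(element.getNodeName());
            return;
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, leafNames);
            }
        }
    }

    // Parses the XML string and returns the names of all leaf elements in document order.
    static List<String> leafNames(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> names = new ArrayList<>();
            walk(dom.getDocumentElement(), names);
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(leafNames("<a><b>1</b><c><d/></c></a>")); // prints [b, d]
    }
}
```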

Child elements of DOM

I have this XML file:
<scene>
<texture file="file1.dds"/>
<texture file="file2.dds"/>
...
<node name="cube">
<texture name="stone" unit="0" sampler="anisotropic"/>
</node>
</scene>
I need all child elements of 'scene' that are named "texture", but with this code:
Element rootNode = document.getDocumentElement();
NodeList childNodes = rootNode.getElementsByTagName("texture");
for (int nodeIx = 0; nodeIx < childNodes.getLength(); nodeIx++) {
Node node = childNodes.item(nodeIx);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// cool stuff here
}
}
I also get the 'texture' elements that are inside 'node'.
How can I filter these out? Or how can I get only the elements that are direct children of 'scene'?

You can do it using XPath. Consider the following example, taken from the JAXP Specification 1.4 (which I recommend consulting for this):
// parse the XML as a W3C Document
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document document = builder.parse(new File("/widgets.xml"));
// evaluate the XPath expression against the Document
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/widgets/widget[@name='a']/@quantity";
Double quantity = (Double) xpath.evaluate(expression, document, XPathConstants.NUMBER);
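
Applied to the scene document from the question, the same approach with the expression /scene/texture selects only the texture elements that are direct children of the root. The following is a sketch with made-up names (DirectChildTextures, directTextureFiles); it returns the file attribute of each match.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DirectChildTextures {

    // Returns the "file" attribute of every <texture> that is a direct child of <scene>.
    static List<String> directTextureFiles(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // A single slash between steps matches children only, not deeper descendants
            NodeList textures = (NodeList) xpath.evaluate(
                    "/scene/texture", document, XPathConstants.NODESET);
            List<String> files = new ArrayList<>();
            for (int i = 0; i < textures.getLength(); i++) {
                files.add(((Element) textures.item(i)).getAttribute("file"));
            }
            return files;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<scene><texture file=\"file1.dds\"/><texture file=\"file2.dds\"/>"
                + "<node name=\"cube\"><texture name=\"stone\" unit=\"0\"/></node></scene>";
        System.out.println(directTextureFiles(xml)); // prints [file1.dds, file2.dds]
    }
}
```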

I found a solution myself that works fine:
Element parent = ... ;
String childName = "texture";
NodeList childs = parent.getChildNodes();
for (int nodeIx = 0; nodeIx < childs.getLength(); nodeIx++) {
    Node node = childs.item(nodeIx);
    // getChildNodes() returns direct children only, so nested elements are skipped
    if (node.getNodeType() == Node.ELEMENT_NODE
            && node.getNodeName().equals(childName)) {
        // cool stuff here
    }
}
