DOM parsing in Java not able to get the nested notes

DOM parsing in Java not able to get the nested notes - java

I have to parse an xml file in which I have many name value pairs.
I have to update the value in case it matches a given name.
I opted for DOM parsing as it can easily traverse any part and can quickly update the value.
It is however giving me some wired results when I am running it on my sample file.
I am new to DOM so if someone can help it can solve my problem.
I tried various things but all resulting in either null values for content or #text node name.
I am not able to get the text content of the tag.
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlFilePath);
//This will get the first NVPair
Node NVPairs = document.getElementsByTagName("NVPairs").item(0);
//This should assign nodes with all the child nodes of NVPairs. This should be ideally
//<nameValuePair>
NodeList nodes = NVPairs.getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
// I think it will consider both starting and closing tag as node so checking for if it has
//child
if(node.hasChildNodes())
{
//This should give me the content in the name tag.
//However this is not happening
if ("Tom".equals(node.getFirstChild().getTextContent())) {
node.getLastChild().setTextContent("2000000");
}
}
}
Sample xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?><application>
<NVPairs>
<nameValuePair>
<name>Tom</name>
<value>12</value>
</nameValuePair>
<nameValuePair>
<name>Sam</name>
<value>121</value>
</nameValuePair>
</NVPairs>

#getChildNodes() and #getFirstChild() returns all kinds of nodes, not just Element nodes, and in this case the first child of <name>Tom</name> is a Text node (with newline and blanks). So your test will never return true.
However, in cases like this, it always much more convenient to use XPath:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(
"//nameValuePair/value[preceding-sibling::name = 'Tom']", document,
XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
node.setTextContent("2000000");
}
I.e., return all <name> elements that has a preceding sibling element <name> with value 'Tom'.

Related

Best way to reach the tag I want in an XML file when it's repeated?

first post in here. I have an XML file that includes the tag "usine" multiple times and I'm doing it in a way that does not seem right and I want to see if there's a more optimal way to do it. This is my first time working with XML and Node/NodeList so I'm still getting familiar with it.
Here is the XML file
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<metadonnees>
<usine type="usine-matiere">
<icones>
<icone type="vide" path="src/ressources/UMP0%.png"/>
<icone type="un-tiers" path="src/ressources/UMP33%.png"/>
<icone type="deux-tiers" path="src/ressources/UMP66%.png"/>
<icone type="plein" path="src/ressources/UMP100%.png"/>
</icones>
<sortie type = "metal"/>
<interval-production>100</interval-production>
</usine>
<usine type="usine-aile">
<icones>
<icone type="vide" path="src/ressources/UT0%.png"/>
<icone type="un-tiers" path="src/ressources/UT33%.png"/>
<icone type="deux-tiers" path="src/ressources/UT66%.png"/>
<icone type="plein" path="src/ressources/UT100%.png"/>
</icones>
<entree type="metal" quantite="2"/>
<sortie type="aile"/>
<interval-production>50</interval-production>
</usine>
</metadonnees>
<simulation>
<usine type="usine-matiere" id="11" x="32" y="32"/>
<usine type="usine-aile" id="21" x="320" y="32"/>
<chemins>
<chemin de="11" vers="21" />
<chemin de="21" vers="41" />
</chemins>
</simulation>
For example, if I want to retrieve the x value of 'usine type="usine-aile"' in the simulation tag, here is the code I use :
NodeList nList = doc.getElementsByTagName("simulation");
Node positionNode = nList.item(0);
Element elementPosition = (Element) positionNode;
NodeList cooList = elementPosition.getElementsByTagName("usine");
Node cooNode = cooList.item(0);
Element cooElem = (Element) cooNode;
System.out.println(cooElem.getAttribute("x"));
Basically I have to make two NodeLists because the I want is in the tag and not the one in the tag, so the first NodeList is to locate me in the tag, then I go deeper making a new NodeList to find the I want. Is there a better way to do this? I'm probably doing it a wrong way so I wish to know your answers. Thanks

First, Get elements by tag name (e.g get elements of usine), it returns a Nodelist of that tag. e.g in simulation you have two usine tag(a NodeList with lenght of 2).
Second, you can iterate this Nodelist and do whatever you want to each node (element), for example you can get the attribute of each usine tag(
x , y, id)
In summary
1- Get element by tag name (NodeList)
2- Iterate Nodelist
3- Process the nodes (e.g Get attribute of each node in the iteration process (x,y,id)
I coded your scenario as follows
public static void main(String argv[]) throws ParserConfigurationException, IOException, SAXException {
//Read xml file
File fXmlFile = new File("/test.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
//Get usine nodes
NodeList nodeList = doc.getElementsByTagName("usine");
//Iterate nodeList
for (int temp = 0; temp < nodeList.getLength(); temp++) {
//Get each node and process it
Node node = nodeList.item(temp);
if (node.getNodeType() == Node.ELEMENT_NODE) {
//Print attributes of the node
Element element = (Element) node;
System.out.println("X = " + element.getAttribute("x"));
System.out.println("Y = " + element.getAttribute("y"));
System.out.println("ID = " + element.getAttribute("id"));
}
}
}
In this post, we're using DOM Parser for parsing XML, you can use this link to become more familiar with other XML processing libraries
like: SAX Parser, StAX Parser and JAXB, these are much better than DOM Parser in terms of speed and performance.

XML parsing: Loop through a child node and save field,values to a hash map

I have an XML with the following structure.
<message>
<header>
</header>
<body>
</body>
<end>
</end>
</message>
Each header,body and end nodes contain fields that i need to extract into separate hash maps. What is the best way to go about this without using external libraries? The end result is to display a two-column view of the entire message. (field name, value)

You can use DocumentBuilderFactory and DocumentBuilder that comes along with java Api.
For Example, refer link.

It depends on the structure of your data and your hashmap: what is the key, what if the value.
Nevertheless, DOM and XPATH do the job:
String xml= // your xml
DocumentBuilderFactory builderFactory =DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
String expression="//header"; // Same for body, ...
XPathExpression expr = xpath.compile(expression) ;
NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int k = 0; k < nodes.getLength(); k++) {
// Do what you want with that
hope it helps

Java XML with namespace issue

I have this code:
org.w3c.dom.Document doc = docBuilder.parse(representation.getStream());
Element element = doc.getDocumentElement();
NodeList nodeList = element.getElementsByTagName("xnat:MRSession.scan.file");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// do something with the current element
my problem is with getElementsByTagName("xnat:MRSession.scan.file")
my xml looks like this:
<?xml version="1.0" encoding="UTF-8"?><xnat:MRSession "REMOVED DATA IGNORE">
<xnat:sharing>
<xnat:share label="23_MR1" project="BOGUS_GSU">
<!--hidden_fields[xnat_experimentData_share_id="1",sharing_share_xnat_experimentDa_id="xnat_E00001"]-->
</xnat:share>
</xnat:sharing>
<xnat:fields>
<xnat:field name="studyComments">
<!--hidden_fields[xnat_experimentData_field_id="1",fields_field_xnat_experimentDat_id="xnat_E00001"]-->S</xnat:field>
</xnat:fields>
<xnat:subject_ID>xnat_S00002</xnat:subject_ID>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:prearchivePath>/home/ryan/xnat_data/prearchive/BOGUS_OUA/20120717_131900137/23_MR1</xnat:prearchivePath>
<xnat:scans>
<xnat:scan ID="1" UID="1.2.840.113654.2.45.2.108830" type="SAG LOCALIZER" xsi:type="xnat:mrScanData">
<!--hidden_fields[xnat_imageScanData_id="1"]-->
<xnat:image_session_ID>xnat_E00001</xnat:image_session_ID>
<xnat:quality>usable</xnat:quality>
<xnat:series_description>SAG LOCALIZER</xnat:series_description>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:frames>29</xnat:frames>
<xnat:file URI="/home/ryan/xnat_data/archive/BOGUS_OUA/arc001/23_MR1/SCANS/1/DICOM/scan_1_catalog.xml" content="RAW" file_count="29" file_size="3968052" format="DICOM" label="DICOM" xsi:type="xnat:resourceCatalog">
So Basically I need to be able to iterate through all the xnat:MRSession/xnat:scan/xnat:file
elements and make some changes. Problem is
getElementsByTagName("xnat:MRSession.scan.file")
Is always null. Please help. Thanks

You could try the following using XPath:
Document document = // the parsed document
XPathFactory xPathFactory = XPathFactory.newInstance();
NodeList allFileNodes = xPathFactory.newXPath().evaluate("\\XNAT_NAMESPACE:file", document.getDocumentElement(), XPathConstants.NODESET);
Instead XNAT_NAMESPACE you would need to specify the exact namespace that is meant with the prefix "xnat" in your example.

DOM Parser query in JAVA

<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="SPECIALNOTE"></code>
<text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
</annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="PREFERREDSPECIMEN"></code>
<text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong> 2 mL Urine with no preservative]]></text>
</annotation>
</subjectOf>
In DOM parsing, how can I traverse through the above XML and get the <text> tag value depending upon a <code> tag attribute having a given value. For example, I want to get the following text:
<strong>** New York State approval pending. This test is not available
for New York State patient testing **</br>
...based on the <code> tag with a code attribute where value="SPECIALNOTE".
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("xml.xml");
XPath xpath = XPathFactory.newInstance().newXPath(); // XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[#code='SPECIALNOTE']");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("........"+nodes.item(i).getNodeValue()+"........");
}
}
}
Appreciate the help in advance...

First, your XPath expression has an error; subjectOf is repeated unnecessarily:
/subjectOf/subjectOf
Now, assuming you really do need a reference to the code node that precedes the target text element, then use the following:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation/code[#code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());
Where getNextElementSibling is defined as follows:
public static Node getNextElementSibling(Node node) {
Node next = node;
do {
next = next.getNextSibling();
} while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
return next;
}
A couple of notes about this:
The reason that getNextSibling did not originally work for you is (most likely) because the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That's why we need getNextElementSibling.
We're selecting a single node, so we're using XPathConstants.NODE instead if XPathConstants.NODELIST
Note that you should probably just do as #Lukas suggests and modify your XPath expression to directly select the target text.
Here's how to get the text directly (as a String):
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);
Here's how to first get a reference to the element and then retrieve the contents of its CDATA section:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());

Fix your XPath expression like this:
/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text
You could then, for instance, access the CDATA content using
Node.getTextContent();
UPDATE: The above XPath seemed correct at the time I posted it. In the meantime, you have completely changed your XML code and now, the XPath would read
/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/#code='SPECIALNOTE']/text
Or, because I am guessing that this question is so messy, it's still wrong, just do:
//annotation[code/#code='SPECIALNOTE']/text

Finally i have got the answer for my question by myself.... Below code is being working for my XML to be parsed...
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[#code='SPECIALNOTE']/following-sibling::text/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
Thank you people who have ansewered in this post but this is a possible solution for it. Have a mark on it.

How to update XML using XPath and Java

I have an XML document, and an XPath expression for that doc. I have to update the doc by using XPath at runtime.
How can I do this using Java?
The below is my xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<PersonList>
<Person>
<Name>Sonu Kapoor</Name>
<Age>24</Age>
<Gender>M</Gender>
<PostalCode>54879</PostalCode>
</Person>
<Person>
<Name>Jasmin</Name>
<Age>28</Age>
<Gender>F</Gender>
<PostalCode>78745</PostalCode>
</Person>
<Person>
<Name>Josef</Name>
<Age>232</Age>
<Gender>F</Gender>
<PostalCode>53454</PostalCode>
</Person>
</PersonList>
I have to change the values of name and age under //PersonList/Person[2]/Name.

Use setNodeValue. First, get a NodeList, for example:
myNodeList = (NodeList) xpath.compile("//MyXPath/text()")
.evaluate(myXmlDoc, XPathConstants.NODESET);
Then set the value of e.g. the first node:
myNodeList.item(0).setNodeValue("Hi mom!");
More examples e.g. here.
As mentioned in two other answers here, as well as in your previous question: technically, XPath is not a way to "update" an XML document, but only to locate nodes within an XML document. But I presume the above is what you want.
EDIT: Responding to your comment... Are you asking how to write your DOM to an XML file after you've finished editing the DOM? If so, here are two examples of how to do it:
http://www.java2s.com/Code/Java/XML/WriteDOMout.htm
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPXSLT4.html

You can delete the file and create a new one.
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
new InputSource("data.xml"));
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//employee/name[text()='old']", doc,
XPathConstants.NODESET);
for (int idx = 0; idx < nodes.getLength(); idx++) {
nodes.item(idx).setTextContent("new value");
}
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File("data_new.xml")));

XPath is used to select parts of an XML document.It has no provision for updating. But since it returns DOM objects (Elements, if memory serves, or maybe Nodes) you can then use DOM methods for altering the document.

XPath can be used to select nodes in a document, not for modification
You apply the xpath expression to your document and get an element (in your case). Once you have this Element, you can use the Element methods to change values (name and age in your case)
Starting from a NodeList it should work like that:
NodeList nodes = getNodeListFromXPathExpression(); // you know how
if (nodes.length == 0)
return; // empty nodelist, xpath didn't select anything
Node first = node.getItem(0); // take the first from the list, your element
// this is a shortcut for your example:
// first is the actual selected element (a node)
// .getFirst() returns the first child node, the "text node" (="Jasmine", ="28")
// .setNodeValue() replace the actual value of that text node with a new string
first.getFirstChild().setNodeValue("New Name or new age");

Consider using XQuery Update instead of XPath. This allows you to write
replace value of node //PersonList/Person[2]/Name with "Anonymous"
This is much easier than using the Java DOM API.

I've created a small project for using XPATH to create/update XML:
https://github.com/shenghai/xmodifier
the code to change your xml is like:
Document document = readDocument("personList.xml");
XModifier modifier = new XModifier(document);
modifier.addModify("//PersonList/Person[2]/Name", "newName");
modifier.modify();

This is a super cool function where you can able to modify any tag value for any XML document using its xpath. You need to pass three arguments xml,xpathExpression and newValue and it returns the XML file as String with modified value.
If you want to pass XML as file, you need to change the function accordingly. But the logic will be same.
public String updateXML(String xml, String xpathExpression, String newValue)
{
try {
//Creating document builder
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new org.xml.sax.InputSource(new StringReader(xml)));
//Evaluating xpath expression using Element
XPath xpath = XPathFactory.newInstance().newXPath();
Element element = (Element)xpath.evaluate(xpathExpression, document, XPathConstants.NODE);
//Setting value in the text
element.setTextContent(value);
//Transformation of document to xml
StringWriter stringWriter = new StringWriter();
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(document), new StreamResult(stringWriter));
xml = stringWriter.toString();
}
catch (Exception e)
{
e.printStackTrace();
}
return xml;
}

Here is the code to change the content with vtd-xml... vtd-xml is unique in that it is the only API that offers incremental update capability.
import com.ximpleware.*;
import java.io.*;
public class changeName {
public static void main(String s[]) throws VTDException,java.io.UnsupportedEncodingException,java.io.IOException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", false))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
XMLModifier xm = new XMLModifier(vn);
ap.selectXPath("//PersonList/Person[2]");
int i=0;
while((i=ap.evalXPath())!=-1){
if (vn.toElement(VTDNav.FIRST_CHILD,"Name")){
int k=vn.getText();
if (i!=-1)
xm.updateToken(k, "Jonathan");
vn.toElement(VTDNav.PARENT);
}
if (vn.toElement(VTDNav.FIRST_CHILD,"Age")){
int k=vn.getText();
if (i!=-1)
xm.updateToken(k, "42");
vn.toElement(VTDNav.PARENT);
}
}
xm.output("new.xml");
}
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

DOM parsing in Java not able to get the nested notes - java

Related

Best way to reach the tag I want in an XML file when it's repeated?

XML parsing: Loop through a child node and save field,values to a hash map

Java XML with namespace issue

DOM Parser query in JAVA

How to update XML using XPath and Java

Categories

Resources