DOM Parser query in JAVA

DOM Parser query in JAVA - java

<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="SPECIALNOTE"></code>
<text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
</annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="PREFERREDSPECIMEN"></code>
<text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong> 2 mL Urine with no preservative]]></text>
</annotation>
</subjectOf>
In DOM parsing, how can I traverse through the above XML and get the <text> tag value depending upon a <code> tag attribute having a given value. For example, I want to get the following text:
<strong>** New York State approval pending. This test is not available
for New York State patient testing **</br>
...based on the <code> tag with a code attribute where value="SPECIALNOTE".
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("xml.xml");
XPath xpath = XPathFactory.newInstance().newXPath(); // XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[#code='SPECIALNOTE']");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("........"+nodes.item(i).getNodeValue()+"........");
}
}
}
Appreciate the help in advance...

First, your XPath expression has an error; subjectOf is repeated unnecessarily:
/subjectOf/subjectOf
Now, assuming you really do need a reference to the code node that precedes the target text element, then use the following:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation/code[#code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());
Where getNextElementSibling is defined as follows:
public static Node getNextElementSibling(Node node) {
Node next = node;
do {
next = next.getNextSibling();
} while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
return next;
}
A couple of notes about this:
The reason that getNextSibling did not originally work for you is (most likely) because the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That's why we need getNextElementSibling.
We're selecting a single node, so we're using XPathConstants.NODE instead if XPathConstants.NODELIST
Note that you should probably just do as #Lukas suggests and modify your XPath expression to directly select the target text.
Here's how to get the text directly (as a String):
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);
Here's how to first get a reference to the element and then retrieve the contents of its CDATA section:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());

Fix your XPath expression like this:
/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text
You could then, for instance, access the CDATA content using
Node.getTextContent();
UPDATE: The above XPath seemed correct at the time I posted it. In the meantime, you have completely changed your XML code and now, the XPath would read
/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/#code='SPECIALNOTE']/text
Or, because I am guessing that this question is so messy, it's still wrong, just do:
//annotation[code/#code='SPECIALNOTE']/text

Finally i have got the answer for my question by myself.... Below code is being working for my XML to be parsed...
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[#code='SPECIALNOTE']/following-sibling::text/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
Thank you people who have ansewered in this post but this is a possible solution for it. Have a mark on it.

Related

XML parsing: Loop through a child node and save field,values to a hash map

I have an XML with the following structure.
<message>
<header>
</header>
<body>
</body>
<end>
</end>
</message>
Each header,body and end nodes contain fields that i need to extract into separate hash maps. What is the best way to go about this without using external libraries? The end result is to display a two-column view of the entire message. (field name, value)

You can use DocumentBuilderFactory and DocumentBuilder that comes along with java Api.
For Example, refer link.

It depends on the structure of your data and your hashmap: what is the key, what if the value.
Nevertheless, DOM and XPATH do the job:
String xml= // your xml
DocumentBuilderFactory builderFactory =DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
String expression="//header"; // Same for body, ...
XPathExpression expr = xpath.compile(expression) ;
NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int k = 0; k < nodes.getLength(); k++) {
// Do what you want with that
hope it helps

DOM parsing in Java not able to get the nested notes

I have to parse an xml file in which I have many name value pairs.
I have to update the value in case it matches a given name.
I opted for DOM parsing as it can easily traverse any part and can quickly update the value.
It is however giving me some wired results when I am running it on my sample file.
I am new to DOM so if someone can help it can solve my problem.
I tried various things but all resulting in either null values for content or #text node name.
I am not able to get the text content of the tag.
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlFilePath);
//This will get the first NVPair
Node NVPairs = document.getElementsByTagName("NVPairs").item(0);
//This should assign nodes with all the child nodes of NVPairs. This should be ideally
//<nameValuePair>
NodeList nodes = NVPairs.getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
// I think it will consider both starting and closing tag as node so checking for if it has
//child
if(node.hasChildNodes())
{
//This should give me the content in the name tag.
//However this is not happening
if ("Tom".equals(node.getFirstChild().getTextContent())) {
node.getLastChild().setTextContent("2000000");
}
}
}
Sample xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?><application>
<NVPairs>
<nameValuePair>
<name>Tom</name>
<value>12</value>
</nameValuePair>
<nameValuePair>
<name>Sam</name>
<value>121</value>
</nameValuePair>
</NVPairs>

#getChildNodes() and #getFirstChild() returns all kinds of nodes, not just Element nodes, and in this case the first child of <name>Tom</name> is a Text node (with newline and blanks). So your test will never return true.
However, in cases like this, it always much more convenient to use XPath:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(
"//nameValuePair/value[preceding-sibling::name = 'Tom']", document,
XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
node.setTextContent("2000000");
}
I.e., return all <name> elements that has a preceding sibling element <name> with value 'Tom'.

Java XML with namespace issue

I have this code:
org.w3c.dom.Document doc = docBuilder.parse(representation.getStream());
Element element = doc.getDocumentElement();
NodeList nodeList = element.getElementsByTagName("xnat:MRSession.scan.file");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// do something with the current element
my problem is with getElementsByTagName("xnat:MRSession.scan.file")
my xml looks like this:
<?xml version="1.0" encoding="UTF-8"?><xnat:MRSession "REMOVED DATA IGNORE">
<xnat:sharing>
<xnat:share label="23_MR1" project="BOGUS_GSU">
<!--hidden_fields[xnat_experimentData_share_id="1",sharing_share_xnat_experimentDa_id="xnat_E00001"]-->
</xnat:share>
</xnat:sharing>
<xnat:fields>
<xnat:field name="studyComments">
<!--hidden_fields[xnat_experimentData_field_id="1",fields_field_xnat_experimentDat_id="xnat_E00001"]-->S</xnat:field>
</xnat:fields>
<xnat:subject_ID>xnat_S00002</xnat:subject_ID>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:prearchivePath>/home/ryan/xnat_data/prearchive/BOGUS_OUA/20120717_131900137/23_MR1</xnat:prearchivePath>
<xnat:scans>
<xnat:scan ID="1" UID="1.2.840.113654.2.45.2.108830" type="SAG LOCALIZER" xsi:type="xnat:mrScanData">
<!--hidden_fields[xnat_imageScanData_id="1"]-->
<xnat:image_session_ID>xnat_E00001</xnat:image_session_ID>
<xnat:quality>usable</xnat:quality>
<xnat:series_description>SAG LOCALIZER</xnat:series_description>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:frames>29</xnat:frames>
<xnat:file URI="/home/ryan/xnat_data/archive/BOGUS_OUA/arc001/23_MR1/SCANS/1/DICOM/scan_1_catalog.xml" content="RAW" file_count="29" file_size="3968052" format="DICOM" label="DICOM" xsi:type="xnat:resourceCatalog">
So Basically I need to be able to iterate through all the xnat:MRSession/xnat:scan/xnat:file
elements and make some changes. Problem is
getElementsByTagName("xnat:MRSession.scan.file")
Is always null. Please help. Thanks

You could try the following using XPath:
Document document = // the parsed document
XPathFactory xPathFactory = XPathFactory.newInstance();
NodeList allFileNodes = xPathFactory.newXPath().evaluate("\\XNAT_NAMESPACE:file", document.getDocumentElement(), XPathConstants.NODESET);
Instead XNAT_NAMESPACE you would need to specify the exact namespace that is meant with the prefix "xnat" in your example.

How to get value in XML using Java?

I have tried a lot of times but I did not how to retrive a value from XML using Java. I tried to use DOM and Xpath. Please help. I can use a String Writer to printout the XML so I know the XML is not empty.
Document doc = parseXML(connection.getInputStream());
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xPath.compile("/xml_api_reply/weather/current_conditions/temp_f/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
The content of XML :
<xml_api_reply version="1">
<weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0">
<current_conditions>
<condition data="Clear"/>
<temp_f data="49"/>
<temp_c data="9"/>
</current_conditions>
</weather>
</xml_api_reply>
It seems that it did not go in to the for loop because nodes is null.

Your XPath expression evaluates to the (non existant) text-nodes underneath temp_f. Yet, you need the value of the data attribute:
/xml_api_reply/weather/current_conditions/temp_f/#data
may do the trick.

Did you try this implementation?
http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/

XPathExpression not evaluatng in the proper context?

I'm trying to parse some XML from the USGS.
Here's an example
The "parameterCd" parameter lists the 3 items of data I want back. I may or may not get all 3 back.
I'm doing this on an Android using the javax libraries.
In my code, I initially retrieve the 0-3 ns1:timeSeries nodes. This works fine. What I then want to do is, within the context of a single timeSeries node, retrieve the ns1:variable and ns1:values nodes.
So in my code below where I have:
expr = xpath.compile("//ns1:variable");
NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
I would expect to only get back one node, since the evaluate SHOULD be happening in the context of the single timeSeriesNode that I'm passing in (according to the documentation). Instead, however, it returns all of the ns1:variable nodes for the document, however.
Am I missing something?
Here's the relevant portions...
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new InstantaneousValuesNamespaceContext());
XPathExpression expr;
NodeList timeSeriesNodes = null;
InputStream is = new ByteArrayInputStream(sourceXml.getBytes());
try {
expr = xpath.compile("//ns1:timeSeries");
timeSeriesNodes = (NodeList) expr.evaluate(new InputSource(is), XPathConstants.NODESET);
for(int timeSeriesIndex = 0;timeSeriesIndex < timeSeriesNodes.getLength(); timeSeriesIndex++){
Node timeSeriesNode = timeSeriesNodes.item(timeSeriesIndex);
expr = xpath.compile("//ns1:variable");
NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
// Problem here. I've got all the variables, not the individual one I want.
for(int variableIndex = 0; variableIndex < variableNodes.getLength(); variableIndex++){
Node variableNode = variableNodes.item(variableIndex);
expr = xpath.compile("//ns1:valueType");
NodeList valueTypeNodes = (NodeList) expr.evaluate(variableNode, XPathConstants.NODESET);
}
}
} catch (XPathExpressionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

Try changing
//ns1:variable
to
.//ns1:variable
Even though, as the docs say, the expression is evaluated within the context of the current node, // is special and (unless modified) always means 'search the whole document from the root'. By putting the . in, you force the meaning you want, 'search the whole tree from this point downwards'.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

DOM Parser query in JAVA - java

Related

XML parsing: Loop through a child node and save field,values to a hash map

DOM parsing in Java not able to get the nested notes

Java XML with namespace issue

How to get value in XML using Java?

XPathExpression not evaluatng in the proper context?

Categories

Resources