I'm trying to parse some XML from the USGS.
Here's an example
The "parameterCd" parameter lists the 3 items of data I want back. I may or may not get all 3 back.
I'm doing this on Android using the javax libraries.
In my code, I initially retrieve the 0-3 ns1:timeSeries nodes. This works fine. What I then want to do is, within the context of a single timeSeries node, retrieve the ns1:variable and ns1:values nodes.
So in my code below where I have:
expr = xpath.compile("//ns1:variable");
NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
I would expect to get back only one node, since the evaluate SHOULD happen in the context of the single timeSeriesNode that I'm passing in (according to the documentation). Instead, it returns all of the ns1:variable nodes in the document.
Am I missing something?
Here are the relevant portions:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new InstantaneousValuesNamespaceContext());
XPathExpression expr;
NodeList timeSeriesNodes = null;
InputStream is = new ByteArrayInputStream(sourceXml.getBytes());
try {
    expr = xpath.compile("//ns1:timeSeries");
    timeSeriesNodes = (NodeList) expr.evaluate(new InputSource(is), XPathConstants.NODESET);
    for (int timeSeriesIndex = 0; timeSeriesIndex < timeSeriesNodes.getLength(); timeSeriesIndex++) {
        Node timeSeriesNode = timeSeriesNodes.item(timeSeriesIndex);
        expr = xpath.compile("//ns1:variable");
        NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
        // Problem here. I've got all the variables, not the individual one I want.
        for (int variableIndex = 0; variableIndex < variableNodes.getLength(); variableIndex++) {
            Node variableNode = variableNodes.item(variableIndex);
            expr = xpath.compile("//ns1:valueType");
            NodeList valueTypeNodes = (NodeList) expr.evaluate(variableNode, XPathConstants.NODESET);
        }
    }
} catch (XPathExpressionException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
Try changing
//ns1:variable
to
.//ns1:variable
Even though, as the docs say, the expression is evaluated in the context of the node you pass in, a path that starts with // is an absolute path: it always searches the whole document from the root, whatever the context node is. Putting the . in front makes the path relative, which gives the meaning you want: 'search the subtree from this node downwards'. The same applies to //ns1:valueType in your inner loop.
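To see the difference in isolation, here is a minimal sketch. It assumes a made-up two-element document without namespaces (not the real USGS response), and the class and method names are hypothetical. It evaluates both forms of the expression against the first timeSeries element and counts the matches:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeXPathDemo {
    // Hypothetical sample document, not the real USGS response.
    static final String XML =
        "<root>"
        + "<timeSeries><variable>a</variable></timeSeries>"
        + "<timeSeries><variable>b</variable></timeSeries>"
        + "</root>";

    /** Counts the nodes 'expression' selects when evaluated against the first timeSeries element. */
    static int countFromFirstTimeSeries(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pick the context node: the first of the two timeSeries elements.
        Node first = (Node) xpath.evaluate("/root/timeSeries[1]", doc, XPathConstants.NODE);
        NodeList hits = (NodeList) xpath.evaluate(expression, first, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countFromFirstTimeSeries("//variable"));  // absolute: whole document
        System.out.println(countFromFirstTimeSeries(".//variable")); // relative: context subtree
    }
}
```

With this sample, the absolute form matches both variable elements and the relative form matches only the one inside the context node.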
Related
I use the worldweatheronline API. The service gives xml in the following form:
<hourly>
<tempC>-3</tempC>
<weatherDesc>rain</weatherDesc>
<precipMM>0.0</precipMM>
</hourly>
<hourly>
<tempC>5</tempC>
<weatherDesc>no</weatherDesc>
<precipMM>0.1</precipMM>
</hourly>
Can I somehow get all the <hourly> nodes in which <tempC> > 0 and <weatherDesc> = rain?
How can I exclude the <hourly> nodes I'm not interested in from the response?
This is quite feasible using XPath.
You can filter a document based on element values, attribute values and other criteria.
Here is a working example that gets the elements according to the first point in the question:
try (InputStream is = Files.newInputStream(Paths.get("C:/temp/test.xml"))) {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document xmlDocument = builder.parse(is);
    XPath xPath = XPathFactory.newInstance().newXPath();
    // get hourly elements that have tempC child element with value > 0 and weatherDesc child element with value = "rain"
    String expression = "//hourly[tempC>0 and weatherDesc=\"rain\"]";
    NodeList hours = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < hours.getLength(); i++) {
        System.out.println(hours.item(i) + " " + hours.item(i).getTextContent());
    }
} catch (Exception e) {
    e.printStackTrace();
}
I think you should create an XSD from the XML and generate JAXB classes. Using those JAXB classes, you can easily unmarshal the XML and apply your logic.
Although I am able to set a text value inside a node with the code below
private static void setPhoneNumber(Document xmlDoc, String phoneNumber) {
    Element root = xmlDoc.getDocumentElement();
    Element phoneParent = (Element) root.getElementsByTagName("gl-bus:entityPhoneNumber").item(0);
    Element phoneElement = (Element) phoneParent.getElementsByTagName("gl-bus:phoneNumber").item(0);
    phoneElement.setTextContent(phoneNumber);
}
I cannot do the same with XPath because I get null for the node object
private static void setPhoneNumber(Document xmlDoc, String phoneNumber) {
    try {
        NodeList nodes = (NodeList) xPath.evaluate("/gl-cor:entityInformation/gl-bus:entityPhoneNumber/gl-bus:phoneNumber", xmlDoc, XPathConstants.NODESET);
        Node node = nodes.item(0);
        node.setTextContent(phoneNumber);
    } catch (XPathExpressionException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
The fact that you're using the non-namespace-aware method getElementsByTagName(), passing it an element name containing a colon, suggests that you're not handling namespaces properly when you parse the XML. If your XML were parsed in a namespace-aware manner then this shouldn't have worked, but something like
String namespace = // the namespace URI bound to the gl-bus prefix in your doc
Element phoneParent = (Element) root.getElementsByTagNameNS(namespace, "entityPhoneNumber").item(0);
would work correctly. Note that the standard Java DocumentBuilderFactory is not namespace-aware by default; you must call setNamespaceAware(true) on the factory before you ask it for a newDocumentBuilder.
XPath requires namespace-aware parsing, and if you want to access elements that are in a namespace via XPath then you must supply a NamespaceContext to the XPath object to tell it what prefix bindings to use - it does not inherit the prefix bindings from the original XML. Annoyingly there's no default implementation of NamespaceContext provided in the core Java library so you either have to write your own or use a third-party implementation such as Spring's SimpleNamespaceContext. With that:
SimpleNamespaceContext ctx = new SimpleNamespaceContext();
ctx.bindNamespaceUri("g", namespace); // the same URI as before
ctx.bindNamespaceUri("c", ...); // the namespace bound to gl-cor:
xPath.setNamespaceContext(ctx);
NodeList nodes = (NodeList) xPath.evaluate("/c:entityInformation/g:entityPhoneNumber/g:phoneNumber", xmlDoc, XPathConstants.NODESET);
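If you'd rather not pull in Spring just for this, a NamespaceContext backed by a Map is short to write. The sketch below is one possible version, not a library class: MapNamespaceContext, bind and readPhone are names made up here, and the two urn:example URIs are placeholders, because the real URIs bound to gl-cor and gl-bus aren't shown in the question.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class MapNamespaceContext implements NamespaceContext {
    // Placeholder URIs: substitute the ones declared in your document.
    static final String COR = "urn:example:gl-cor";
    static final String BUS = "urn:example:gl-bus";

    private final Map<String, String> prefixToUri = new HashMap<>();

    /** Registers a prefix for use in XPath expressions; returns this for chaining. */
    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
            if (e.getValue().equals(namespaceURI)) {
                return e.getKey();
            }
        }
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
            ? Collections.<String>emptyIterator()
            : Collections.singletonList(prefix).iterator();
    }

    /** Parses a small sample document namespace-aware and reads the phone number via XPath. */
    static String readPhone() throws Exception {
        String xml =
            "<gl-cor:entityInformation xmlns:gl-cor='" + COR + "' xmlns:gl-bus='" + BUS + "'>"
            + "<gl-bus:entityPhoneNumber><gl-bus:phoneNumber>555-0100</gl-bus:phoneNumber>"
            + "</gl-bus:entityPhoneNumber></gl-cor:entityInformation>";
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The XPath prefixes (c, g) need not match the prefixes used in the document.
        xpath.setNamespaceContext(new MapNamespaceContext().bind("c", COR).bind("g", BUS));
        return (String) xpath.evaluate(
            "/c:entityInformation/g:entityPhoneNumber/g:phoneNumber", doc, XPathConstants.STRING);
    }
}
```

Only the URIs have to agree between the document and the context; the prefixes are local to each.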
I have to parse an xml file in which I have many name value pairs.
I have to update the value in case it matches a given name.
I opted for DOM parsing as it can easily traverse any part and can quickly update the value.
However, it gives me some weird results when I run it on my sample file.
I am new to DOM, so any help would be appreciated.
I have tried various things, but they all result in either null values for the content or #text as the node name.
I am not able to get the text content of the tag.
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlFilePath);
// This will get the first NVPairs
Node NVPairs = document.getElementsByTagName("NVPairs").item(0);
// This should assign nodes with all the child nodes of NVPairs.
// Ideally this should be the <nameValuePair> elements.
NodeList nodes = NVPairs.getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    // I think it will consider both starting and closing tag as node,
    // so checking whether it has a child
    if (node.hasChildNodes()) {
        // This should give me the content in the name tag.
        // However this is not happening
        if ("Tom".equals(node.getFirstChild().getTextContent())) {
            node.getLastChild().setTextContent("2000000");
        }
    }
}
Sample xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<application>
    <NVPairs>
        <nameValuePair>
            <name>Tom</name>
            <value>12</value>
        </nameValuePair>
        <nameValuePair>
            <name>Sam</name>
            <value>121</value>
        </nameValuePair>
    </NVPairs>
</application>
#getChildNodes() and #getFirstChild() return all kinds of nodes, not just Element nodes. In this case the first child of each <nameValuePair> is a Text node (the newline and blanks before <name>), not the <name> element itself, so your test compares "Tom" against whitespace and never returns true.
However, in cases like this, it is usually much more convenient to use XPath:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(
        "//nameValuePair/value[preceding-sibling::name = 'Tom']", document,
        XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    node.setTextContent("2000000");
}
I.e., return all <value> elements that have a preceding sibling element <name> with value 'Tom'.
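If you'd rather stay with plain DOM traversal, the original loop can be made to work by skipping everything that isn't an element. A rough sketch, assuming the sample XML from the question; NameValueUpdater, update, firstChildElement and parse are hypothetical helper names:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameValueUpdater {
    /**
     * Sets <value> to newValue in every <nameValuePair> whose <name> equals 'name'.
     * Returns the number of pairs that were updated.
     */
    static int update(Document doc, String name, String newValue) {
        int updated = 0;
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            Element nameEl = firstChildElement(pair, "name");
            Element valueEl = firstChildElement(pair, "value");
            if (nameEl != null && valueEl != null
                    && name.equals(nameEl.getTextContent().trim())) {
                valueEl.setTextContent(newValue);
                updated++;
            }
        }
        return updated;
    }

    /** Returns the first child element of 'parent' with the given tag name, skipping text nodes. */
    static Element firstChildElement(Node parent, String tagName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Parses an XML string into a DOM document (not namespace-aware, like the question's code). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }
}
```

Calling update(document, "Tom", "2000000") on the parsed sample would change only Tom's value. The XPath version is still shorter; this just shows where the whitespace nodes get in the way.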
I have an XML file that contains two elements: Project and Layer. I want to get the idLayer attribute with the highest number using Java. My code is not working properly:
public int GetMaxID() throws JAXBException {
    try {
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xPath = xPathFactory.newXPath();
        String expression = "//Project/Layer/@idLayer[not(. <=../preceding-sibling::Layer/@idLayer) and not(. <=../following-sibling::Layer/@idLayer)]";
        XPathExpression xPathExpression = xPath.compile(expression);
        InputSource doc = new InputSource(new InputStreamReader(new FileInputStream(new File("Projects//asdad//ProjectDataBase.xml"))));
        NodeList elem1List = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
        int maxId = elem1List.getLength(); // give me 0
    } catch (Exception e) {
        e.printStackTrace();
    }
    return -1;
}
My XML code:
<tns:Project xmlns:tns="http://www.example.org/ProjectDataBase" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/ProjectDataBase ProjectDataBase.xsd ">
    <tns:Layer idLayer="1">
        <tns:LayerName>tns:LayerName1</tns:LayerName>
    </tns:Layer>
    <tns:Layer idLayer="2">
        <tns:LayerName>tns:LayerName2</tns:LayerName>
    </tns:Layer>
    <tns:Layer idLayer="3">
        <tns:LayerName>tns:LayerName3</tns:LayerName>
    </tns:Layer>
</tns:Project>
Can you point me to the right direction?
Your problem is the tns namespace. You don't use it in your XPath expression, therefore it cannot select anything.
There are countless examples of how to register XML namespaces with JDOM, for example this one.
Also, your XPath is way too complicated.
Use //tns:Project/tns:Layer[not(@idLayer < ../tns:Layer/@idLayer)]/@idLayer.
Mind that this does not give the maximum node, but all the maximum nodes - there could be more than one.
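Putting the two points together might look like the sketch below. It uses the standard javax.xml.xpath API rather than JDOM, binds the tns prefix through a small anonymous NamespaceContext, and returns the first of the maximum values (or -1 when nothing matches). MaxLayerId and maxId are names invented for the example, and the XML is passed in as a string to keep it self-contained.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MaxLayerId {
    // Namespace URI taken from the question's XML.
    static final String TNS = "http://www.example.org/ProjectDataBase";

    /** Returns the highest idLayer value in the document, or -1 if there is no Layer. */
    static int maxId(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so tns:Layer is matched by namespace
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "tns".equals(prefix) ? TNS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return TNS.equals(uri) ? "tns" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return TNS.equals(uri)
                    ? Collections.singletonList("tns").iterator()
                    : Collections.<String>emptyIterator();
            }
        });

        // Select the idLayer of every Layer for which no sibling Layer has a larger idLayer.
        NodeList ids = (NodeList) xpath.evaluate(
            "//tns:Project/tns:Layer[not(@idLayer < ../tns:Layer/@idLayer)]/@idLayer",
            doc, XPathConstants.NODESET);
        return ids.getLength() == 0 ? -1 : Integer.parseInt(ids.item(0).getNodeValue());
    }
}
```

For the XML in the question this yields 3. If several layers share the maximum, they all come back in the node list and the sketch simply takes the first.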
<subjectOf typeCode="SUBJ">
    <annotation classCode="ACT" moodCode="EVN">
        <realmCode code="QD" />
        <code code="SPECIALNOTE"></code>
        <text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
    </annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
    <annotation classCode="ACT" moodCode="EVN">
        <realmCode code="QD" />
        <code code="PREFERREDSPECIMEN"></code>
        <text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong> 2 mL Urine with no preservative]]></text>
    </annotation>
</subjectOf>
In DOM parsing, how can I traverse through the above XML and get the <text> tag value depending upon a <code> tag attribute having a given value. For example, I want to get the following text:
<strong>** New York State approval pending. This test is not available
for New York State patient testing **</br>
...based on the <code> tag with a code attribute where value="SPECIALNOTE".
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse("xml.xml");
        XPath xpath = XPathFactory.newInstance().newXPath(); // XPath Query for showing all nodes value
        XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[@code='SPECIALNOTE']");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println("........" + nodes.item(i).getNodeValue() + "........");
        }
    }
}
Appreciate the help in advance...
First, your XPath expression has an error; subjectOf is repeated unnecessarily:
/subjectOf/subjectOf
Now, assuming you really do need a reference to the code node that precedes the target text element, then use the following:
XPathExpression expr = xpath.compile(
        "/testCodeIdentifier/subjectOf/annotation/code[@code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());
Where getNextElementSibling is defined as follows:
public static Node getNextElementSibling(Node node) {
    Node next = node;
    do {
        next = next.getNextSibling();
    } while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
    return next;
}
A couple of notes about this:
The reason that getNextSibling did not originally work for you is (most likely) because the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That's why we need getNextElementSibling.
We're selecting a single node, so we're using XPathConstants.NODE instead of XPathConstants.NODESET.
Note that you should probably just do as @Lukas suggests and modify your XPath expression to directly select the target text.
Here's how to get the text directly (as a String):
XPathExpression expr = xpath.compile(
        "/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);
Here's how to first get a reference to the element and then retrieve the contents of its CDATA section:
XPathExpression expr = xpath.compile(
        "/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());
Fix your XPath expression like this:
/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text
You could then, for instance, access the CDATA content using
Node.getTextContent();
UPDATE: The above XPath seemed correct at the time I posted it. In the meantime, you have completely changed your XML code and now, the XPath would read
/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/@code='SPECIALNOTE']/text
Or, since this question is so messy that I'm guessing the path is still wrong, just do:
//annotation[code/@code='SPECIALNOTE']/text
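For anyone who wants to try the shortened expression on its own, here is a small sketch. It assumes a cut-down version of the XML with a testCodeIdentifier root (the real document structure isn't fully shown in the question), and AnnotationText and specialNote are made-up names:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AnnotationText {
    // Cut-down, hypothetical version of the question's XML.
    static final String XML =
        "<testCodeIdentifier>"
        + "<subjectOf><annotation>"
        + "<code code='SPECIALNOTE'></code>\n"
        + "<text><![CDATA[<strong>note</strong>]]></text>"
        + "</annotation></subjectOf>"
        + "<subjectOf><annotation>"
        + "<code code='PREFERREDSPECIMEN'></code>\n"
        + "<text><![CDATA[2 mL urine]]></text>"
        + "</annotation></subjectOf>"
        + "</testCodeIdentifier>";

    /** Returns the content of the <text> whose sibling <code> has code='SPECIALNOTE', or null. */
    static String specialNote(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate filters on the sibling <code>, so no sibling navigation is needed in Java.
        Node text = (Node) xpath.evaluate(
            "//annotation[code/@code='SPECIALNOTE']/text", doc, XPathConstants.NODE);
        // getTextContent() returns the CDATA content as plain text, markup characters included.
        return text == null ? null : text.getTextContent();
    }
}
```

Because the whitespace between code and text is never visited, the text-node problem from the getNextSibling approach doesn't come up here.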
I finally found the answer to my question myself. The code below works for parsing my XML:
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[@code='SPECIALNOTE']/following-sibling::text/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}
Thank you to everyone who answered in this post; this is one possible solution. Have a look at it.