Retrieve multiple attributes with XPath


I have an XML file that's similar to this (each element has more attributes):
<DocBuild>
<XMLDependency name="Name1" product="Product ABC" area="JKL" />
<XMLDependency name="Name2" product="Product DEF" area="MNO" />
<XMLDependency name="Name3" product="Product GHI" area="PQR" />
</DocBuild>
I want to retrieve each 'name' attribute and the 'area' for that element so I can build a list that looks like this (I've inserted a dash between 'name' and 'area' for clarity):
Name1-JKL
Name2-MNO
Name3-PQR
public static Element getConfig(...) throws XPathExpressionException{
String path = MessageFormat.format("//DocBuild//XMLDependency[@name='Name1']//@area ");
}

XPath expressions follow rules somewhat like regular expressions. In this case you can use "|", the XPath union operator, which acts like an OR between two paths.
// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();
// Create XPath object
XPath xpath = xpathFactory.newXPath();
String name = null;
try {
XPathExpression expr =
xpath.compile("/DocBuild/XMLDependency[@name='Name1']//@name|/DocBuild/XMLDependency[@name='Name1']//@area");
NodeList nl = (NodeList)expr.evaluate(doc,XPathConstants.NODESET);
String nameAttr = "";
for (int index = nl.getLength()-1; index >= 0; index--) {
Node node = nl.item(index);
nameAttr += node.getTextContent();
nameAttr += "-";
}
nameAttr = nameAttr.substring(0,nameAttr.lastIndexOf("-"));
System.out.println(nameAttr);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
See XPath Syntax

This XPath 2.0 expression,
/DocBuild/XMLDependency/concat(@name,'-',@area)
evaluates directly to
Name1-JKL
Name2-MNO
Name3-PQR
for your sample XML, as requested.
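As far as I know, the XPath engine bundled with the JDK (javax.xml.xpath) implements XPath 1.0 only, so a path step ending in concat(...) needs an XPath 2.0 processor such as Saxon. A minimal XPath 1.0 sketch of the same result is to select the elements and join the two attributes in Java. The class name NameAreaList and the method nameAreaPairs are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NameAreaList {

    // Hypothetical helper: returns one "name-area" string per XMLDependency element.
    public static List<String> nameAreaPairs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select the elements themselves (plain XPath 1.0), not their attributes.
        NodeList deps = (NodeList) xpath.evaluate(
                "/DocBuild/XMLDependency", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            // Reading attributes by name avoids relying on attribute order.
            result.add(dep.getAttribute("name") + "-" + dep.getAttribute("area"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<DocBuild>"
                + "<XMLDependency name=\"Name1\" product=\"Product ABC\" area=\"JKL\" />"
                + "<XMLDependency name=\"Name2\" product=\"Product DEF\" area=\"MNO\" />"
                + "<XMLDependency name=\"Name3\" product=\"Product GHI\" area=\"PQR\" />"
                + "</DocBuild>";
        nameAreaPairs(xml).forEach(System.out::println);
    }
}
```

Reading each attribute from its own element keeps every name paired with the area of the same element, which a flat union of attribute nodes does not guarantee.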

Related

Getting a node from an XML document

I use the worldweatheronline API. The service gives xml in the following form:
<hourly>
<tempC>-3</tempC>
<weatherDesc>rain</weatherDesc>
<precipMM>0.0</precipMM>
</hourly>
<hourly>
<tempC>5</tempC>
<weatherDesc>no</weatherDesc>
<precipMM>0.1</precipMM>
</hourly>
Can I somehow get all the <hourly> nodes in which <tempC> > 0 and <weatherDesc> = rain?
How can I exclude from the response the <hourly> nodes that I am not interested in?

This is quite feasible using XPath.
You can filter a document based on element values, attribute values and other criteria.
Here is a working example that gets the elements according to the first point in the question:
try (InputStream is = Files.newInputStream(Paths.get("C:/temp/test.xml"))) {
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document xmlDocument = builder.parse(is);
XPath xPath = XPathFactory.newInstance().newXPath();
// get hourly elements that have tempC child element with value > 0 and weatherDesc child element with value = "rain"
String expression = "//hourly[tempC>0 and weatherDesc=\"rain\"]";
NodeList hours = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < hours.getLength(); i++) {
System.out.println(hours.item(i) + " " + hours.item(i).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
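For the second point in the question, one possible approach is to select the unwanted elements with the negated predicate and detach them from the document. This is only a sketch: the class name HourlyFilter, the method dropUninteresting and the <weather> wrapper element used in the usage example are assumptions, since the question does not show the root element of the real response.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HourlyFilter {

    // Hypothetical helper: removes every <hourly> that is NOT (tempC > 0 and weatherDesc = rain).
    public static Document dropUninteresting(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Same condition as in the answer above, wrapped in not(...).
        NodeList unwanted = (NodeList) xpath.evaluate(
                "//hourly[not(tempC > 0 and weatherDesc = 'rain')]",
                doc, XPathConstants.NODESET);

        for (int i = 0; i < unwanted.getLength(); i++) {
            Node node = unwanted.item(i);
            node.getParentNode().removeChild(node);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // The <weather> root is an assumed wrapper for illustration.
        String xml = "<weather>"
                + "<hourly><tempC>-3</tempC><weatherDesc>rain</weatherDesc></hourly>"
                + "<hourly><tempC>5</tempC><weatherDesc>rain</weatherDesc></hourly>"
                + "<hourly><tempC>5</tempC><weatherDesc>no</weatherDesc></hourly>"
                + "</weather>";
        Document filtered = dropUninteresting(xml);
        System.out.println(filtered.getElementsByTagName("hourly").getLength());
    }
}
```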

I think you should create an XSD from the XML and generate JAXB classes. Using those JAXB classes you can easily unmarshal the XML and apply your logic.

Java XML XPath Full XML

I've got a small problem. I have the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("result1.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//element");
String elements = (String) expr.evaluate(doc, XPathConstants.STRING);
What I get:
jcruz0@exblog.jp
Cheryl
Blake
195115
What I want:
<person>
<email>jcruz0@exblog.jp</email>
<firstname>Cheryl</firstname>
<lastname>Blake</lastname>
<number>195115</number>
</person>
So as you can see, I want the full XML tree, not just the node value.
Maybe somebody knows the trick.
Thanks for any help.

You got the string value of the selected XML element because you specified XPathConstants.STRING to XPathExpression.evaluate().
Instead, specify a return type of XPathConstants.NODE if you know for sure that your XPath will select a single element,
Node element = (Node) expr.evaluate(doc, XPathConstants.NODE);
or XPathConstants.NODESET for multiple elements, which you would then iterate over to process as necessary.
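Once you hold a Node, one common way to turn it back into markup is an identity Transformer from javax.xml.transform. The following is a minimal sketch under that assumption; the class name NodeToXml, the method selectAsXml and the <people> wrapper in the usage example are hypothetical and not taken from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeToXml {

    // Hypothetical helper: evaluates the expression and serializes the first match as XML text.
    public static String selectAsXml(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
        if (node == null) {
            return null; // nothing matched the expression
        }

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person><email>jcruz0@exblog.jp</email>"
                + "<firstname>Cheryl</firstname><lastname>Blake</lastname>"
                + "<number>195115</number></person></people>";
        System.out.println(selectAsXml(xml, "//person"));
    }
}
```

For a NODESET result, the same transform call can be applied to each item of the NodeList in turn.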

Something like this can be done.
XPathExpression expr = xpath.compile("/person");
NodeList elements = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < elements.getLength(); i++) {
// the person node
System.out.println(elements.item(i).getNodeName());
for (int x = 0; x < elements.item(i).getChildNodes().getLength(); x++) {
// the elements under person
if (elements.item(i).getChildNodes().item(x).getNodeType() == Node.ELEMENT_NODE) {
System.out.println("\t" + elements.item(i).getChildNodes().item(x).getNodeName() + " - " + elements.item(i).getChildNodes().item(x).getTextContent());
}
}
}
Output
person
email - jcruz0@exblog.jp
firstname - Cheryl
lastname - Blake
number - 195115
You can use the nodes to do what you want, or wrap them in < and > if you just want to print them.

Java, XPath Expression to read all node names, node values, and attributes

I need help making an XPath expression to read all node names, node values, and attributes in an XML string. I wrote this:
private List<String> listOne = new ArrayList<String>();
private List<String> listTwo = new ArrayList<String>();
public void read(String xml) {
try {
// Turn String into a Document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
// Setup XPath to retrieve all tags and values
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(node.getNodeName());
listTwo.add(node.getNodeValue());
// Another list to hold attributes
}
} catch(Exception e) {
LogHandle.info(e.getMessage());
}
}
I found the expression //text()[normalize-space()=''] online; however, it doesn't work. When I try to get the node name from listOne, it is just #text. I tried //, but that doesn't work either. If I had this XML:
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
listOne[0] should hold Data, listOne[1] should hold Test, listTwo[1] should hold blah, etc... All the attributes will be saved in another parallel list.
What expression should xPath evaluate?
Note: The XML String can have different tags, so I can't hard code anything.
Update: Tried this loop:
NodeList nodeList = (NodeList) xPath.evaluate("//*", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(i, node.getNodeName());
// If null then must be text node
if(node.getChildNodes() == null)
listTwo.add(i, node.getTextContent());
}
However, this only gets the root element Data, then just stops.

//* will select all element nodes, //@* all attribute nodes. However, an element node does not have a meaningful node value in the DOM, so you would need to read getTextContent() instead of getNodeValue().
As you seem to consider an element with child elements to have a "null" value I think you need to check whether there are any child elements:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse("sampleInput1.xml");
XPathFactory fact = XPathFactory.newInstance();
XPath xpath = fact.newXPath();
NodeList allElements = (NodeList)xpath.evaluate("//*", doc, XPathConstants.NODESET);
ArrayList<String> elementNames = new ArrayList<>();
ArrayList<String> elementValues = new ArrayList<>();
for (int i = 0; i < allElements.getLength(); i++)
{
Node currentElement = allElements.item(i);
elementNames.add(i, currentElement.getLocalName());
elementValues.add(i, xpath.evaluate("*", currentElement, XPathConstants.NODE) != null ? null : currentElement.getTextContent());
}
for (int i = 0; i < elementNames.size(); i++)
{
System.out.println("Name: " + elementNames.get(i) + "; value: " + (elementValues.get(i)));
}
For the sample input
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
the output is
Name: Data; value: null
Name: Test; value: blah
Name: Foo; value: bar
Name: Date; value: 12242016
Name: Phone; value: null
Name: Home; value: 5555555555
Name: Mobile; value: 5555556789
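The question also asks for attributes. A sketch of one way to collect them alongside each element name is to walk getAttributes() for every element returned by //*. The class name ElementDump, the method describe and the "name @attr=value" output format are invented for illustration; namespace declarations such as xmlns="..." are skipped here on the assumption that they are not wanted in the list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ElementDump {

    // Hypothetical helper: one line per element, e.g. "Date @id=2".
    public static List<String> describe(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList all = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//*", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Node element = all.item(i);
            StringBuilder line = new StringBuilder(element.getLocalName());

            NamedNodeMap attrs = element.getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                String name = attr.getNodeName();
                // Skip namespace declarations (xmlns and xmlns:prefix).
                if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                    continue;
                }
                line.append(" @").append(name).append('=').append(attr.getNodeValue());
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Data xmlns=\"Somenamespace.nsc\"><Test>blah</Test>"
                + "<Date id=\"2\">12242016</Date></Data>";
        describe(xml).forEach(System.out::println);
    }
}
```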

Get max Id from xml code using Xpath

I have an XML file that contains two elements: Project and Layer. I want to get the idLayer attribute with the highest number using Java. My code is not working properly:
public int GetMaxID() throws JAXBException {
try {
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
String expression = "//Project/Layer/@idLayer[not(. <=../preceding-sibling::Layer/@idLayer) and not(. <=../following-sibling::Layer/@idLayer)]";
XPathExpression xPathExpression = xPath.compile(expression);
InputSource doc = new InputSource(new InputStreamReader(new FileInputStream(new File("Projects//asdad//ProjectDataBase.xml"))));
NodeList elem1List = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
int maxId = elem1List.getLength();//give me 0
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}
My XML code:
<tns:Project xmlns:tns="http://www.example.org/ProjectDataBase" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/ProjectDataBase ProjectDataBase.xsd ">
<tns:Layer idLayer="1">
<tns:LayerName>tns:LayerName1</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="2">
<tns:LayerName>tns:LayerName2</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="3">
<tns:LayerName>tns:LayerName3</tns:LayerName>
</tns:Layer>
</tns:Project>
Can you point me in the right direction?

Your problem is the tns namespace. You don't use it in your XPath expression, therefore it cannot select anything.
There are countless examples of how to register XML namespaces with JDOM, for example this one.
Also, your XPath is way too complicated.
Use //tns:Project/tns:Layer[not(@idLayer < ../tns:Layer/@idLayer)]/@idLayer.
Note that this selects every node that has the maximum value, not a single node; there could be more than one.
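With the standard javax.xml.xpath API that the question uses, the prefix can be bound through XPath.setNamespaceContext. The following is a minimal sketch, assuming a namespace-aware parser; the class name MaxLayerId and the method maxLayerId are hypothetical, and the namespace URI is the one from the question's XML.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class MaxLayerId {

    private static final String TNS = "http://www.example.org/ProjectDataBase";

    // Hypothetical helper: returns the highest idLayer value, or -1 if no Layer exists.
    public static int maxLayerId(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed XPath steps to match
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "tns".equals(prefix) ? TNS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return TNS.equals(namespaceURI) ? "tns" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator(); // not needed for this lookup
            }
        });

        // Evaluating to a string yields the value of the first matching attribute.
        String value = xpath.evaluate(
                "//tns:Project/tns:Layer[not(@idLayer < ../tns:Layer/@idLayer)]/@idLayer",
                doc);
        return value.isEmpty() ? -1 : Integer.parseInt(value);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tns:Project xmlns:tns=\"" + TNS + "\">"
                + "<tns:Layer idLayer=\"1\"/><tns:Layer idLayer=\"3\"/>"
                + "<tns:Layer idLayer=\"2\"/></tns:Project>";
        System.out.println(maxLayerId(xml));
    }
}
```

If several layers share the maximum, they all carry the same value, so taking the first match is enough when only the number is needed.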

DOM Parser query in JAVA

<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="SPECIALNOTE"></code>
<text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
</annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="PREFERREDSPECIMEN"></code>
<text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong> 2 mL Urine with no preservative]]></text>
</annotation>
</subjectOf>
In DOM parsing, how can I traverse through the above XML and get the <text> tag value depending upon a <code> tag attribute having a given value. For example, I want to get the following text:
<strong>** New York State approval pending. This test is not available
for New York State patient testing **</br>
...based on the <code> tag with a code attribute where value="SPECIALNOTE".
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("xml.xml");
XPath xpath = XPathFactory.newInstance().newXPath(); // XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[@code='SPECIALNOTE']");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("........"+nodes.item(i).getNodeValue()+"........");
}
}
}
Appreciate the help in advance...

First, your XPath expression has an error; subjectOf is repeated unnecessarily:
/subjectOf/subjectOf
Now, assuming you really do need a reference to the code node that precedes the target text element, then use the following:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation/code[@code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());
Where getNextElementSibling is defined as follows:
public static Node getNextElementSibling(Node node) {
Node next = node;
do {
next = next.getNextSibling();
} while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
return next;
}
A couple of notes about this:
The reason that getNextSibling did not originally work for you is (most likely) because the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That's why we need getNextElementSibling.
We're selecting a single node, so we're using XPathConstants.NODE instead of XPathConstants.NODESET.
Note that you should probably just do as @Lukas suggests and modify your XPath expression to directly select the target text.
Here's how to get the text directly (as a String):
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);
Here's how to first get a reference to the element and then retrieve the contents of its CDATA section:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());

Fix your XPath expression like this:
/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text
You could then, for instance, access the CDATA content using
Node.getTextContent();
UPDATE: The above XPath seemed correct at the time I posted it. In the meantime, you have completely changed your XML code, and now the XPath would read
/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/@code='SPECIALNOTE']/text
Or, since the question has been edited so much that this may still be wrong, just use:
//annotation[code/@code='SPECIALNOTE']/text

I finally found the answer to my question myself. The code below works for the XML I need to parse:
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[@code='SPECIALNOTE']/following-sibling::text/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
Thanks to everyone who answered in this post. This is one possible solution; take note of it.
