XML - extracting value from specific node

XML - extracting value from specific node - java

im struggling with extracting value from a specific node in my XML document. Im using w3c.DOM as i have found many tutorials on it but now i cant find any good ones for this task - i had to use XPath for this task instead.
I always know the exact path (and passing it as a string, example: "Car/Wheels/Wheel[#Index=´x´]/" ) leading to a node from which i need to extract a value (a string) and return it (im converting the string into doubles and integers in other methods later). Variable myDoc is Document myDoc.
How do i get this value?
private String xPathValue(String path){
XPath myPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(path);
String result = (String)expr.evaluate(myDoc);
return result;
}
This however doesnt work and i dont want to create any NodeList since i know the exact paths. Im looking for something that works like Node.getTextContent();

You have 2 options
1) Alter you xPath to return the value of the node instead of the node itself
Using expression: Car/Wheels/Wheel[#Index=´x´]/text()
private String xPathValue(String path) {
XPath myPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(path);
String result = (String)expr.evaluate(myDoc, XPathConstants.STRING);
return result;
}
2) Use the same xpath query but return a node type
Using expression: Car/Wheels/Wheel[#Index=´x´]
private String xPathValue(String path) {
XPath myPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(path);
Node result = (Node)expr.evaluate(myDoc, XPathConstants.NODE);
return result.getTextContent();
}

Related

Get max Id from xml code using Xpath

I have a XML file that contains two elements: Project and Layer. I want to get attribute idLayer with the highest number using java. My code is not working properly:
public int GetMaxID() throws JAXBException {
try {
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
String expression = "//Project/Layer/#idLayer[not(. <=../preceding-sibling::Layer/#idLayer) and not(. <=../following-sibling::Layer/#idLayer)]";
XPathExpression xPathExpression = xPath.compile(expression);
InputSource doc = newInputSource(newInputStreamReader(newFileInputStream(newFile("Projects//asdad//ProjectDataBase.xml"))));
NodeList elem1List = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
int maxId = elem1List.getLength();//give me 0
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}
My XML code:
<tns:Project xmlns:tns="http://www.example.org/ProjectDataBase" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/ProjectDataBase ProjectDataBase.xsd ">
<tns:Layer idLayer="1">
<tns:LayerName>tns:LayerName1</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="2">
<tns:LayerName>tns:LayerName2</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="3">
<tns:LayerName>tns:LayerName3</tns:LayerName>
</tns:Layer>
</tns:Project>
Can you point me to the right direction?

Your problem is the tns namespace. You don't use it in your XPath expression, therefore it cannot select anything.
There are countless examples of how to register XML namespaces with JDOM, for example this one.
Also, your XPath is way too complicated.
Use //tns:Project/tns:Layer[not(#idLayer < ../tns:Layer/#idLayer)]/#idLayer.
Mind that this does not give the maximum node, but all the maximum nodes - there could be more than one.

DOM Parser query in JAVA

<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="SPECIALNOTE"></code>
<text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
</annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
<annotation classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<code code="PREFERREDSPECIMEN"></code>
<text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong> 2 mL Urine with no preservative]]></text>
</annotation>
</subjectOf>
In DOM parsing, how can I traverse through the above XML and get the <text> tag value depending upon a <code> tag attribute having a given value. For example, I want to get the following text:
<strong>** New York State approval pending. This test is not available
for New York State patient testing **</br>
...based on the <code> tag with a code attribute where value="SPECIALNOTE".
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("xml.xml");
XPath xpath = XPathFactory.newInstance().newXPath(); // XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[#code='SPECIALNOTE']");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("........"+nodes.item(i).getNodeValue()+"........");
}
}
}
Appreciate the help in advance...

First, your XPath expression has an error; subjectOf is repeated unnecessarily:
/subjectOf/subjectOf
Now, assuming you really do need a reference to the code node that precedes the target text element, then use the following:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation/code[#code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());
Where getNextElementSibling is defined as follows:
public static Node getNextElementSibling(Node node) {
Node next = node;
do {
next = next.getNextSibling();
} while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
return next;
}
A couple of notes about this:
The reason that getNextSibling did not originally work for you is (most likely) because the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That's why we need getNextElementSibling.
We're selecting a single node, so we're using XPathConstants.NODE instead if XPathConstants.NODELIST
Note that you should probably just do as #Lukas suggests and modify your XPath expression to directly select the target text.
Here's how to get the text directly (as a String):
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);
Here's how to first get a reference to the element and then retrieve the contents of its CDATA section:
XPathExpression expr = xpath.compile(
"/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());

Fix your XPath expression like this:
/testCodeIdentifier/subjectOf/annotation[code/#code='SPECIALNOTE']/text
You could then, for instance, access the CDATA content using
Node.getTextContent();
UPDATE: The above XPath seemed correct at the time I posted it. In the meantime, you have completely changed your XML code and now, the XPath would read
/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/#code='SPECIALNOTE']/text
Or, because I am guessing that this question is so messy, it's still wrong, just do:
//annotation[code/#code='SPECIALNOTE']/text

Finally i have got the answer for my question by myself.... Below code is being working for my XML to be parsed...
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[#code='SPECIALNOTE']/following-sibling::text/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
Thank you people who have ansewered in this post but this is a possible solution for it. Have a mark on it.

XPathExpression not evaluatng in the proper context?

I'm trying to parse some XML from the USGS.
Here's an example
The "parameterCd" parameter lists the 3 items of data I want back. I may or may not get all 3 back.
I'm doing this on an Android using the javax libraries.
In my code, I initially retrieve the 0-3 ns1:timeSeries nodes. This works fine. What I then want to do is, within the context of a single timeSeries node, retrieve the ns1:variable and ns1:values nodes.
So in my code below where I have:
expr = xpath.compile("//ns1:variable");
NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
I would expect to only get back one node, since the evaluate SHOULD be happening in the context of the single timeSeriesNode that I'm passing in (according to the documentation). Instead, however, it returns all of the ns1:variable nodes for the document, however.
Am I missing something?
Here's the relevant portions...
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new InstantaneousValuesNamespaceContext());
XPathExpression expr;
NodeList timeSeriesNodes = null;
InputStream is = new ByteArrayInputStream(sourceXml.getBytes());
try {
expr = xpath.compile("//ns1:timeSeries");
timeSeriesNodes = (NodeList) expr.evaluate(new InputSource(is), XPathConstants.NODESET);
for(int timeSeriesIndex = 0;timeSeriesIndex < timeSeriesNodes.getLength(); timeSeriesIndex++){
Node timeSeriesNode = timeSeriesNodes.item(timeSeriesIndex);
expr = xpath.compile("//ns1:variable");
NodeList variableNodes = (NodeList) expr.evaluate(timeSeriesNode, XPathConstants.NODESET);
// Problem here. I've got all the variables, not the individual one I want.
for(int variableIndex = 0; variableIndex < variableNodes.getLength(); variableIndex++){
Node variableNode = variableNodes.item(variableIndex);
expr = xpath.compile("//ns1:valueType");
NodeList valueTypeNodes = (NodeList) expr.evaluate(variableNode, XPathConstants.NODESET);
}
}
} catch (XPathExpressionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

Try changing
//ns1:variable
to
.//ns1:variable
Even though, as the docs say, the expression is evaluated within the context of the current node, // is special and (unless modified) always means 'search the whole document from the root'. By putting the . in, you force the meaning you want, 'search the whole tree from this point downwards'.

using xpath in java to go through this list?

i have an xml file that contains lots of different nodes. some in particularly are nested like this:
<emailAddresses>
<emailAddress>
<value>sambj1981#gmail.com</value>
<typeSource>WORK</typeSource>
<typeUser></typeUser>
<primary>false</primary>
</emailAddress>
<emailAddress>
<value>sambj#hotmail.co.uk</value>
<typeSource>HOME</typeSource>
<typeUser></typeUser>
<primary>true</primary>
</emailAddress>
</emailAddresses>
From the above node, what i want to do is go through each and get the values inside it(value, typeSource, typeUser etc) and put them in a POJO.
i tried to see if i can use this xpath expression "//emailAddress" but it doesnt return me the tags inside inside it. maybe i am doing it wrong. i am pretty new to using xpath.
i could do something like this:
//emailAddress/value | //emailAddress/typeSource | .. but doing that will list all elements values together if im not mistaken leaving me to work out when i have finished reading from a specific emailAddress tag and going to the next emailAddress tag.
well to sum up my needs i basically want this to be returned similar to how you would return results from a bog standard sql query that returns results in a row. i.e. if your sql query produces 10 emailAddress, it will return each emailAddress in a row and i can simply iterate over "each emailAddress" and get the appropriate value based on the colunm name or index.

No,
//emailAddress
doesn't return the tags inside, that is correct. What it does return is a NodeList/NodeSet. To actually get the values you can do something like this:
String emailpath = "//emailAddress";
String emailvalue = ".//value";
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
Document document;
public XpathStuff(String file) throws ParserConfigurationException, IOException, SAXException {
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = docFactory.newDocumentBuilder();
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
document = builder.parse(bis);
NodeList nodeList = getNodeList(document, emailpath);
for(int i = 0; i < nodeList.getLength(); i++){
System.out.println(getValue(nodeList.item(i), emailvalue));
}
bis.close();
}
public NodeList getNodeList(Document doc, String expr) {
try {
XPathExpression pathExpr = xpath.compile(expr);
return (NodeList) pathExpr.evaluate(doc, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return null;
}
//extracts the String value for the given expression
private String getValue(Node n, String expr) {
try {
XPathExpression pathExpr = xpath.compile(expr);
return (String) pathExpr.evaluate(n,
XPathConstants.STRING);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return null;
}
Maybe I should point out that when iterating over the Nodelist, in .//values the first dot means the current context. Without the dot you would get the first node all the time.

//emailAddress/*
will get these nodes in the document order.
It depends on how you want to iterate through the nodes. We do all our XML using XOM (http://www.xom.nu/) which is an easy reliable Java package. It's possible to write your own strategy using XOM calls.

If you use XStream you can set it up quite easily. Like so:
#XStreamAlias( "EmailAddress" )
public class EmailAddress {
#XStreamAlias()
private String value;
#XStreamAlias()
private String typeSource;
#XStreamAlias()
private String typeUser;
#XStreamAlias()
private boolean primary;
// ... the rest omitted for brevity
}
You then marshal & unmarshal quite simply like so:
XStream xstream = new XStream();
xstream.processAnnotations( EmailAddress.class );
xstream.toXML( /* Object value here */ emailAddress );
xstream.fromXML( /* String xml value here */ "" );
IDK if you have to use XPath or not, but if not I'd consider an out of the box solution like this.

I am totally aware this is not what you were asking for, but may consider using jibx. This is a tool for human-readable XML to POJO mapping.
So I believe you could generate mapping for your email structure in a quick way and let the jibx do the work for you.

How to set an empty value using XPath?

Using this xml example:
<templateitem itemid="5">
<templateitemdata>%ARN%</templateitemdata>
</templateitem>
<templateitem itemid="6">
<templateitemdata></templateitemdata>
</templateitem>
I am using XPath to get and set Node values. The code I am using to get the nodes is:
private static Node ***getNode***(Document doc, String XPathQuery) throws XPathExpressionException
{
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(XPathQuery);
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
if(nodes != null && nodes.getLength() >0)
return nodes.item(0);
throw new XPathExpressionException("No node list found for " + XPathQuery);
}
To get %ARN% value: "//templateitem[#itemid=5]/templateitemdata/text()" and with the getNode method I can get the node, and then call the getNodeValue().
Besides getting that value, I would like to set templateitemdata value for the "templateitem[#itemid=6]" since its empty. But the code I use can't get the node since its empty. The result is null.
Do you know a way get the node so I can set the value?

You simply ask for the element node itself (not for its child text node):
//templateitem[#itemid=6]/templateitemdata
getNodeValue() works on an element node as well, using text() in the XPath is completely superfluous, in both cases.

I changed the method for:
public static Node getNode(Document doc, String XPathQuery) throws XPathExpressionException
{
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(XPathQuery);
Object result = expr.evaluate(doc, XPathConstants.NODE);
Node node = (Node) result;
if(node != null )
return node;
throw new XPathExpressionException("No node list found for " + XPathQuery);
}
The query for: //templateitem[#itemid=6]/templateitemdata
And the setValue method for:
public static void setValue(final Document doc, final String XPathQuery, final String value) throws XPathExpressionException
{
Node node = getNode(doc, XPathQuery);
if(node!= null)
node.setTextContent(value);
else
throw new XPathExpressionException("No node found for " + XPathQuery);
}
I use setTextContent() instead setNodeValue().

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

XML - extracting value from specific node - java

Related

Get max Id from xml code using Xpath

DOM Parser query in JAVA

XPathExpression not evaluatng in the proper context?

using xpath in java to go through this list?

How to set an empty value using XPath?

Categories

Resources