java xpath list concatenation - java

I am using java XPathFactory to get values from a simple xml file:
<Obama>
<coolnessId>0</coolnessId>
<cars>0</cars>
<cars>1</cars>
<cars>2</cars>
</Obama>
With the expression //Obama/coolnessId | //Obama/cars the result is:
0
0
1
2
From this result, I cannot distinguish between what is the coolnessId and what is the car id. I would need something like:
CoolnessId: 0
CarId: 0
CarId: 1
CarId: 2
With concat('c_id: ', //Obama/coolnessId,' car_id: ',//Obama/cars) I am close to the solution, but concat cannot be used for a list of values.
Unfortunately, I cannot use string-join, because it does not seem to be known to my XPath library. And I cannot manipulate the given XML.
What other tricks can I use to get a list of values with something like an alias?

If you select the elements rather than their text content you'll have some context:
public static void main(String[] args) throws Exception {
String xml =
"<Obama>" +
" <coolnessId>0</coolnessId>" +
" <cars>0</cars>" +
" <cars>1</cars>" +
" <cars>2</cars>" +
"</Obama>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//Obama/cars | //Obama/coolnessId");
NodeList result = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < result.getLength(); i++) {
Element item = (Element) result.item(i);
System.out.println(item.getTagName() + ": " + item.getTextContent());
}
}

Assuming you ask for the result of the evaluation as a NODESET (XPathConstants.NODESET), your XPath expression actually returns a sequence of four element nodes, not a sequence of four strings as you suggest. If your input uses the DOM tree model, these will be returned in the form of a DOM NodeList. You can process the Node objects in this NodeList to get the names of the nodes as well as their string values.
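If you switch to an XPath 3.1 engine such as Saxon, you can get the result directly as a single string using the following XPath expression (written as it would appear inside a Java string literal, where \n becomes a newline character):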
string-join((//Obama/coolnessId | //Obama/cars) ! (name() || ': ' || string()), '\n')
To invoke XPath expressions in Saxon you can use either the JAXP API (javax.xml.xpath) or Saxon's s9api interface: I would recommend s9api because it understands the richer type system of XPath 2.0 and beyond.
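Staying with the JDK's built-in XPath 1.0 engine, the NodeList processing described above can also produce the aliased labels the question asks for. The following is a minimal sketch, not a definitive implementation: the class name AliasDemo and the alias table (coolnessId to CoolnessId, cars to CarId) are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AliasDemo {
    // Hypothetical alias table: element name -> label wanted in the output.
    static final Map<String, String> ALIASES =
            Map.of("coolnessId", "CoolnessId", "cars", "CarId");

    static List<String> labelledValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Select the elements themselves so that their names stay available.
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//Obama/coolnessId | //Obama/cars", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                // Fall back to the raw element name when no alias is configured.
                String alias = ALIASES.getOrDefault(node.getNodeName(), node.getNodeName());
                lines.add(alias + ": " + node.getTextContent());
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Obama><coolnessId>0</coolnessId>"
                + "<cars>0</cars><cars>1</cars><cars>2</cars></Obama>";
        labelledValues(xml).forEach(System.out::println);
    }
}
```

Keeping the aliases in a map means the XPath expression and the XML stay untouched; only the table changes when a new label is needed.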

Related

Java DOM: getElementsByTagName is returning no elements?

I am trying to parse the XML document at http://web.mta.info/status/ServiceStatusSubway.xml and extract all the PtSituationElement elements with the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document subwayStatusDoc = builder.parse(new URL("http://web.mta.info/status/ServiceStatusSubway.xml").openStream());
NodeList situationList = subwayStatusDoc.getDocumentElement().getElementsByTagName("PtSituationElement");
System.out.println(situationList.item(0)); //prints null
What am I doing wrong here ?
The PtSituationElement tags contain child tags, so you need to go into those. Just printing .item(0) relies on the toString() method, and apparently it does not do a great job of explaining your nodes.
So add this to see some of the data in the child nodes:
Node item = situationList.item(0);
NodeList childNodes = item.getChildNodes();
for (int j = 0; j < childNodes.getLength(); j++) {
System.out.println(childNodes.item(j).getTextContent());
}
(I'm not sure what you want to do with the data in the xml structure, but this example shows how you can proceed with your work.)
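A slightly fuller sketch of the same idea follows. It checks that a match exists before calling item(0), and it prints only element children so that whitespace-only text nodes are skipped. The sample XML and its child element names (Summary, Planned) are made up for illustration and are not taken from the MTA feed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElementsDemo {
    // Returns "name = text" for each element child of the first element called tagName.
    static List<String> describeChildren(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getDocumentElement().getElementsByTagName(tagName);
            List<String> lines = new ArrayList<>();
            if (matches.getLength() == 0) {
                return lines; // nothing matched, so item(0) would be null
            }
            NodeList children = matches.item(0).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip text nodes that only hold indentation between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    lines.add(child.getNodeName() + " = " + child.getTextContent().trim());
                }
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real feed has a different structure.
        String xml = "<Siri><PtSituationElement>\n"
                + "  <Summary>Delays</Summary>\n"
                + "  <Planned>false</Planned>\n"
                + "</PtSituationElement></Siri>";
        describeChildren(xml, "PtSituationElement").forEach(System.out::println);
    }
}
```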
Also, I noted that the LongDescription tags contain HTML that is not correct XML (<br clear=left> should be &lt;br clear=left&gt; etc). The parser could have a problem with that. It would be better if the HTML was escaped (see How to escape "&" in XML?).

Getting null values from XPath query

I have this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<iet:aw-data xmlns:iet="http://care.aw.com/IET/2007/12" class="com.aw.care.bean.resource.MessageResource">
<iet:metadata filter=""/>
<iet:message-resource>
<iet:message>some message 1</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.11</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
<iet:message-resource>
<iet:message>some message 2</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.12</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
.
.
.
.
</iet:aw-data>
Using the code below, I go over the data and look for what I need.
try {
FileInputStream fileIS = new FileInputStream(new File("resources\\bootstrap\\content\\MessageResources_iw_IL\\MessageResource_iw_IL.ctdata.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String query = "//*[local-name()='message-resource']//*[local-name()='code'][contains(text(), 'account')]";
NodeList nodeList = (NodeList) xPath.compile(query).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("size= " + nodeList.getLength());
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue());
}
}
catch (Exception e){
e.printStackTrace();
}
The issue is that I get only null values when printing in the for loop. Any idea why this happens?
The code needs to return a list of nodes whose code and message fields contain the given parameters (like an SQL query with two parameters joined by an AND operator).
Check the documentation:
https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html
getNodeValue() applied to an element node returns null.
Use getTextContent().
Alternatively, if you find DOM too frustrating, switch to one of the better tree models like JDOM2 or XOM.
Also, if you used an XPath 2.0 engine like Saxon, it would (a) simplify your expression to
//*:message-resource//*:code[contains(text(), 'account')]
and (b) allow you to return a sequence of strings from the XPath expression, rather than a sequence of nodes, so you wouldn't have to mess around with nodelists.
Another point: I suspect that the predicate [contains(text(), 'account')] should really be [.='account']. I'm not sure of that, but using text() instead of ".", and using contains() instead of "=", are both common mistakes.
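The difference between getNodeValue() and getTextContent() on an element node can be seen in a small sketch. It reuses a cut-down copy of the XML from the question; the class name NodeValueDemo and the search string 'claimfiling' are assumptions chosen so that the sample data produces a match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeValueDemo {
    static final String XML =
            "<iet:aw-data xmlns:iet=\"http://care.aw.com/IET/2007/12\">"
          + "<iet:message-resource>"
          + "<iet:message>some message 1</iet:message>"
          + "<iet:code>edi.claimfilingindicator.11</iet:code>"
          + "</iet:message-resource>"
          + "</iet:aw-data>";

    // Returns the first code element whose string value contains 'claimfiling'.
    static Node firstCode(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String query = "//*[local-name()='code'][contains(., 'claimfiling')]";
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(query, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        Node code = firstCode(XML);
        // For an element node the DOM specification defines nodeValue as null.
        System.out.println("getNodeValue():   " + code.getNodeValue());
        // getTextContent() concatenates the text of all descendant text nodes.
        System.out.println("getTextContent(): " + code.getTextContent());
    }
}
```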

Access data in xml as string

I am receiving a xml in string format. Is there any library to search for elements in the string?
<Version value="0"/>
<IssueDate>2017-12-15</IssueDate>
<Locale>en_US</Locale>
<RecipientAddress>
<Category>Primary</Category>
<SubCategory>0</SubCategory>
<Name>Vitsi</Name>
<Attention>Stowell Group Llc.</Attention>
<AddressLine1>511 6th St</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-2903</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>2903</ZIP4>
</RecipientAddress>
<RecipientAddress>
<Category>Additional</Category>
<SubCategory>1</SubCategory>
<Name>Vitsi</Name>
<AddressLine1>Po Box 957</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-0104</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>0104</ZIP4>
</RecipientAddress>
<SenderName>TMO</SenderName>
<SenderId>IL</SenderId>
<SenderAddress>
<Name>T-mobile</Name>
<AddressLine1>Po Box 790047</AddressLine1>
<City>St. Louis</City>
<PresentationValue>ST. LOUIS MO 63179-0047</PresentationValue>
<State>MO</State>
<ZIPCode>63179</ZIPCode>
.
.
.
.
I want to access the element RecipientAddress, which is a list. Is there any library to do that? Please note that what I receive is a string. It is an invoice and there will be many to process, so performance is important
Following options are available:
Convert xml string to java objects using JAXB.
Use .indexOf() in string method to retrieve specific parts of xml.
Use regular expression to retrieve specific parts of xml.
SAX/DOM/STAX parser for parsing and extraction from xml.
Xpath for fetching the specific values from xml.
You could use XPath. Java has built-in support for querying XML without any third-party library.
The code would look like this:
String xmlInputStr = "<YOUR_XML_STRING_INPUT>";
String xpathExpressionStr = "<XPATH_EXPRESSION_STRING>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
// builder.parse(String) would treat the argument as a URI, so wrap the XML text
// in an InputSource (needs java.io.StringReader and org.xml.sax.InputSource)
Document doc = builder.parse(new InputSource(new StringReader(xmlInputStr)));
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(xpathExpressionStr);
You can write your own expression string for querying. A typical example:
"/RecipientAddress/Category"
Evaluate the expression against the document to retrieve a list of nodes:
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
Then iterate over the nodes:
for (int i = 0; i < nodes.getLength(); i++) {
Node nNode = nodes.item(i);
...
}
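Put together, the steps above might look like the following sketch. It is an illustration under stated assumptions rather than a tuned implementation: the Invoice root element is invented because the fragment in the question shows no root, and the class and method names are hypothetical. Compiling the expressions once and reusing them for every address is one simple way to avoid repeated work when many invoices are processed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RecipientAddressDemo {
    // Assumed wrapper: the <Invoice> root is not part of the original fragment.
    static final String XML = "<Invoice>"
            + "<RecipientAddress><Category>Primary</Category>"
            + "<City>Lake Oswego</City></RecipientAddress>"
            + "<RecipientAddress><Category>Additional</Category>"
            + "<City>Lake Oswego</City></RecipientAddress>"
            + "</Invoice>";

    // Returns one "Category: City" line per RecipientAddress element.
    static List<String> summarize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression addressExpr = xpath.compile("//RecipientAddress");
            // Relative expressions, evaluated against each address element.
            XPathExpression categoryExpr = xpath.compile("Category");
            XPathExpression cityExpr = xpath.compile("City");
            NodeList addresses = (NodeList) addressExpr.evaluate(doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < addresses.getLength(); i++) {
                Node address = addresses.item(i);
                lines.add(categoryExpr.evaluate(address) + ": " + cityExpr.evaluate(address));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        summarize(XML).forEach(System.out::println);
    }
}
```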
A lot of ready-made APIs are available to convert XML to Java objects.
Please have a look at Xerces from Apache.
If you only want to extract a specific value, put the whole document into a string and use indexOf("string").

Issue with xPath results in java

I am having trouble understanding the behavior of the code below, and until I understand it I am having a hard time trying to fix it. I have isolated the issue down to the simplest code snippet I can that displays it:
String sourceXML = "<root>\n"
+ "<Rule test=\"1\"/>\n"
+ "<Rule test=\"2\"/>\n"
+ "</root>";
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(sourceXML));
Document doc = db.parse(is);
NodeList ruleList = doc.getElementsByTagName("Rule");
System.out.println("Number of Items found : " + ruleList.getLength());
for (int t = 0; t < ruleList.getLength(); t++) {
if (ruleList.item(t).getNodeType() == Node.ELEMENT_NODE) {
Element ruleElement = (Element) ruleList.item(t);
String xPathToUse = "//Rule/@test";
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList ruleNodeList = (NodeList) xpath.evaluate(xPathToUse, ruleElement, XPathConstants.NODESET);
System.out.println("Found " + ruleNodeList.getLength() + " matches to xpath.....");
}
}
Generates the following output:
Number of Items found : 2
Found 2 matches to xpath.....
Found 2 matches to xpath.....
My expectation is that there would be only 1 XPath match per iteration, as I am running the XPath on each element that I have extracted from the source XML. The output I would expect is:
Number of Items found : 2
Found 1 matches to xpath.....
Found 1 matches to xpath.....
However, it appears that when looping over the NodeList (which is correct: there are 2 elements in the source), the XPath is being run on the whole source XML each time, even though I thought I had extracted each node and was running the XPath on that node only.
Can anybody help me understand what I am doing wrong here?
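The behavior described in the question matches standard XPath 1.0 semantics: an expression that begins with // is evaluated from the root of the document that contains the context node, so the element passed to evaluate() does not narrow the search. A relative expression such as @test is evaluated against the context node itself. The sketch below illustrates the difference; the class name ContextNodeDemo and its helper method are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContextNodeDemo {
    static final String XML = "<root>\n"
            + "<Rule test=\"1\"/>\n"
            + "<Rule test=\"2\"/>\n"
            + "</root>";

    // Evaluates the expression with the first Rule element as the context node
    // and returns the number of matching nodes.
    static int countFromFirstRule(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element rule = (Element) doc.getElementsByTagName("Rule").item(0);
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, rule, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        // Absolute: searches the whole document, whatever the context node is.
        System.out.println("//Rule/@test -> " + countFromFirstRule(XML, "//Rule/@test"));
        // Relative: looks only at the test attribute of the context element.
        System.out.println("@test        -> " + countFromFirstRule(XML, "@test"));
    }
}
```

With the sample input, the absolute expression matches both test attributes and the relative one matches only the attribute of the context element.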

How to program some XPath functions using Java Design Patterns

I need your help and your experience to realize the best java code using Design Patterns.
I must write some custom XPath functions that can:
Load a DOM document (I can use a mock object);
Check the validity of a user XPath expression;
Find and return the DOM nodes that satisfy the user expression.
I must evaluate only absolute expressions ( /... ) that can contain the path expression " .. " and predicates, embedded in square brackets, regarding attributes or leaf nodes, for examples:
/com/university/student/../exam
/com/university/exam[@tt = 'poo']/vote
/com/university/student/number[. = '1234']
I'll use the Composite pattern for the first step, the Chain of Responsibility for the second step and a Visitor for the third step, but I am not sure that this is the best way to do this.
Can Chain of Responsibility be useful to check the validity?
All suggestions are welcome, thank you in advance for any help you can provide.
Isn't it a bit ... overcomplicated?
Create a DOM object for some XML input
Compile the user input - XPath will complain if it is not valid (XPathExpressionException)
Evaluate the expression with the DOM object
Sample:
// #1 load document
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
// #2 - validate expression
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = null;
try {
// assign to the variable declared above; redeclaring it here would not compile
expr = xpath.compile(getExpression());
} catch (XPathExpressionException e) {
// ... handle & return <- invalid expression
}
// #3 evaluate expression
String result = expr.evaluate(doc);
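A self-contained variant of the sample above might look like this. It is only a sketch: the class XPathCheckDemo, its two helper methods and the sample document are assumptions made for illustration, built around the example expressions from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathCheckDemo {
    // Hypothetical sample document matching the paths used in the question.
    static final String XML = "<com><university>"
            + "<student><number>1234</number></student>"
            + "<exam tt=\"poo\"><vote>30</vote></exam>"
            + "</university></com>";

    // Step 2: an expression counts as valid if the XPath compiler accepts it.
    static boolean isValid(String expression) {
        try {
            XPathFactory.newInstance().newXPath().compile(expression);
            return true;
        } catch (XPathExpressionException e) {
            return false;
        }
    }

    // Steps 1 and 3: load the document and return the string value of the result.
    static String evaluate(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String good = "/com/university/student/../exam[@tt = 'poo']/vote";
        String bad = "/com/university/exam[";
        System.out.println(good + " valid? " + isValid(good));
        System.out.println(bad + " valid? " + isValid(bad));
        System.out.println("result: " + evaluate(XML, good));
    }
}
```

Leaving syntax checking to the XPath compiler keeps the custom code small; a hand-written validator would only be needed if the allowed subset of XPath has to be enforced more strictly than the engine does.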
