Can we use the tokenize function in XPath?
The general Java code I use to process XSLT and XML files is:
XPath xPath = XPathFactory.newInstance().newXPath();
InputSource inputXML = new InputSource(new StringReader(xml));
String expression = "/root/customer/personalDetails[age=tokenize('20|30','|')]/name";
boolean evaluate1 = (boolean) xPath.compile(expression).evaluate(inputXML, XPathConstants.BOOLEAN);
XML :-
<?xml version="1.0" encoding="ISO-8859-15"?>
<root>
<customer>
<personalDetails>
<name>ABC</name>
<value>20</value>
</personalDetails>
<personalDetails>
<name>XYZ</name>
<value>21</value>
</personalDetails>
<personalDetails>
<name>PQR</name>
<value>30</value>
</personalDetails>
</customer>
</root>
Expected response: ABC,PQR
Yes, you can use the tokenize() function in XPath, provided your XPath processor supports XPath 2.0 or later.
For Java, the popular choice of XPath 2.0+ processor is Saxon.
You can use the JAXP API with Saxon; however, JAXP is not really designed to work well with XPath 2.0+, so it's preferable to use Saxon's own API (called s9api).
For this particular example, you don't need tokenize(). In XPath 2.0+ you can write
[age=('20', '30')]
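If you have to stay on the JDK's built-in XPath processor, which only implements XPath 1.0, neither tokenize() nor the sequence literal is available. A minimal sketch of an XPath 1.0 equivalent, assuming the element to test is value (the sample XML has no age element); the class and method names are made up for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamesByValue {

    // Hypothetical helper: names of the personalDetails whose <value> is 20 or 30.
    static String matchingNames(String xml) throws XPathExpressionException {
        XPath xPath = XPathFactory.newInstance().newXPath();
        // XPath 1.0 has no sequences, so the alternatives are spelled out with "or".
        String expression =
                "/root/customer/personalDetails[value='20' or value='30']/name";
        // Ask for a NODESET: a BOOLEAN result would only say whether anything matched.
        NodeList nodes = (NodeList) xPath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return String.join(",", names);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><customer>"
                + "<personalDetails><name>ABC</name><value>20</value></personalDetails>"
                + "<personalDetails><name>XYZ</name><value>21</value></personalDetails>"
                + "<personalDetails><name>PQR</name><value>30</value></personalDetails>"
                + "</customer></root>";
        System.out.println(matchingNames(xml)); // prints ABC,PQR
    }
}
```

For a long or dynamic list of values this gets unwieldy, which is where an XPath 2.0+ processor such as Saxon pays off.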
I am trying to get an element from an XML document using XPath in Java.
Without a schema definition / declaration everything works fine as expected:
Example from https://www.w3schools.com/xml/schema_howto.asp
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The XPath /note/heading returns an element.
After declaring an XML schema:
<?xml version="1.0"?>
<note
xmlns="https://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.w3schools.com/xml note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The XPath /note/heading no longer works.
Java example from XPathTutorial
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("inventory.xml");
//Create XPath
XPathFactory xpathfactory = XPathFactory.newInstance();
XPath xpath = xpathfactory.newXPath();
System.out.println("\n//1) Get book titles written after 2001");
XPathExpression expr = xpath.compile("/note/heading/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
In XPath 1 a step like note selects elements named note in no namespace. You have put the elements into a namespace, so your path with "unqualified" names (i.e. without a prefix) doesn't select any input elements. Your Java code needs to bind a prefix (e.g. w3s) to the namespace URI (i.e. https://www.w3schools.com) and then use that prefix in each step, e.g. w3s:note.
Or use an XPath 2 or 3 or XQuery implementation where you can declare a default element namespace for path expressions so that your step note would select the elements in that default element namespace you set up as e.g. https://www.w3schools.com.
If you want to ignore the namespace for XPath evaluation, due to the prefix hassle for XPath 1, you might get away with explicitly not using a namespace aware document builder but rather one that is not namespace aware.
With XPath 2 or later or XQuery 1 or later you can also use namespace wildcards like *:note.
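A minimal sketch of the prefix binding with the JDK's XPath 1.0 API, assuming the note document from the question. The prefix w3s is an arbitrary choice and the class and method names are made up for illustration:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NoteHeading {

    private static final String NS = "https://www.w3schools.com";

    // Hypothetical helper: returns the text of <heading> in the namespaced note.
    static String heading(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the prefix "w3s" to the namespace URI used by the document.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "w3s".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return NS.equals(namespaceURI) ? "w3s" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singletonList(prefix).iterator();
            }
        });
        // Every step needs the prefix, not only the first one.
        return xpath.evaluate("/w3s:note/w3s:heading", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<note xmlns=\"https://www.w3schools.com\">"
                + "<to>Tove</to><from>Jani</from>"
                + "<heading>Reminder</heading>"
                + "<body>Don't forget me this weekend!</body></note>";
        System.out.println(heading(xml)); // prints Reminder
    }
}
```

The prefix in the XPath does not have to match anything in the document; only the namespace URI has to be the same.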
I'm learning to parse XML files and to query them with XPath. I don't know how to list all the Names without repeating any of them. Is there an option for this, or should I do it manually?
<Return>
<ReturnData>
<Person>
<Name>Samuel</Name>
</Person>
<Person>
<Name>Samuel</Name>
</Person>
</ReturnData>
</Return>
In XPath 2.0 and higher, use distinct-values(//Name).
Java's built-in XPath processor only supports XPath 1.0, in which this query is surprisingly difficult, but there are third-party Java libraries supporting XPath 2.0, 3.0, and 3.1, notably Saxon. Saxon-HE is open source, see http://saxon.sf.net/.
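If adding a library is not an option, one workaround is to select every Name with XPath 1.0 and drop the duplicates on the Java side. A minimal sketch under that assumption; the class and method names are made up for illustration:

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DistinctNames {

    // Hypothetical helper: distinct <Name> values in document order.
    static Set<String> distinctNames(String xml) throws XPathExpressionException {
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(
                "//Name", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        // LinkedHashSet removes repeats while keeping first-seen order.
        Set<String> names = new LinkedHashSet<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Return><ReturnData>"
                + "<Person><Name>Samuel</Name></Person>"
                + "<Person><Name>Samuel</Name></Person>"
                + "</ReturnData></Return>";
        System.out.println(distinctNames(xml)); // prints [Samuel]
    }
}
```

A pure XPath 1.0 alternative is //Name[not(. = preceding::Name)], which keeps only the first occurrence of each value but becomes slow on large documents.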
I have a lot of XML files from different versions of schemas. Certain sections/tags in these files are the same.
What I want to do is locate a particular tag and start processing it. The problem is that this tag may appear at different locations in the XML.
So I am looking for an XPath that will locate this node irrespective of its location. I am using Java to write my processing code.
The following are the various flavours of the XML files:
Sample 1
<nodeIWant>
<book>
<title>Harry Potter and the Philosophers Stone</title>
...
</book>
</nodeIWant>
Sample 2
<a>
<nodeIWant>
<book>
<title>Harry Potter and the Philosophers Stone</title>
...
</book>
</nodeIWant>
</a>
Sample 3
<b>
<nodeIWant>
<book>
<title>Harry Potter and the Philosophers Stone</title>
...
</book>
</nodeIWant>
</b>
In the above xmls I want to use the same xpath to locate the node 'nodeIWant'.
The Java code I am using is the following
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = factory.newDocumentBuilder();
Document modelDoc = docBuilder.parse(args[0]);
XPath xPath = XPathFactory.newInstance().newXPath();
System.out.println(xPath.evaluate("//nodeIWant", modelDoc.getDocumentElement(), XPathConstants.NODE));
This prints out a null.
Final Edit
The answer by Mathias Müller works for these XML files. I am actually trying to query the .emx files in Rational Software Architect. I was trying to avoid using these as examples. (Please don't start talking about BIRT and using the Eclipse UML APIs etc... I have tried these and they do not give me what I want.)
The structure of the files is the following
<?xml version="1.0" encoding="UTF-8"?>
<!--xtools2_universal_type_manager-->
<?com.ibm.xtools.emf.core.signature <signature id="com.ibm.xtools.uml.msl.model" version="7.0.0"><feature description="" name="com.ibm.xtools.ruml.feature" url="" version="7.0.0"/></signature>?>
<?com.ibm.xtools.emf.core.signature <signature id="com.ibm.xtools.mmi.ui.signatures.diagram" version="7.0.0"><feature description="" name="Rational Modeling Platform (com.ibm.xtools.rmp)" url="" version="7.0.0"/></signature>?>
<xmi:XMI version="2.0" xmlns:Default="http:///schemas/Default/_fNm3AAqoEd6-N_NOT9vsCA/2" xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore" xmlns:uml="http://www.eclipse.org/uml2/3.0.0/UML" xmlns:xmi="http://www.omg.org/XMI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http:///schemas/Default/_fNm3AAqoEd6-N_NOT9vsCA/2 pathmap://UML2_MSL_PROFILES/Default.epx#_fNwoAAqoEd6-N_NOT9vsCA?Default/Default?">
<uml:Model name="A" xmi:id="_4lzSsMywEeGAuoBpYhfj6Q">
<!-- Lot of other stuff -->
</uml:Model>
</xmi:XMI>
The other file is
<?xml version="1.0" encoding="UTF-8"?>
<!--xtools2_universal_type_manager-->
<?com.ibm.xtools.emf.core.signature <signature id="com.ibm.xtools.uml.msl.model" version="7.0.0"><feature description="" name="com.ibm.xtools.ruml.feature" url="" version="7.0.0"/></signature>?>
<?com.ibm.xtools.emf.core.signature <signature id="com.ibm.xtools.mmi.ui.signatures.diagram" version="7.0.0"><feature description="" name="Rational Modeling Platform (com.ibm.xtools.rmp)" url="" version="7.0.0"/></signature>?>
<uml:Model xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore" xmlns:uml="http://www.eclipse.org/uml2/3.0.0/UML" xmi:id="_4lzSsMywEeGAuoBpYhfj6Q" name="A">
<!-- Lot of other stuff -->
</uml:Model>
Shouldn't the xpath of '//Model' work for these two samples as well?
You can use the XPath abbreviation // (short for /descendant-or-self::node()/). It searches the whole file for your node regardless of its parent nodes. So in your example you can use:
//nodeIWant
Not very familiar with DocumentBuilder, but perhaps you need to compile an XPath expression before evaluating it against a document? It seems it's not XPath expressions that are evaluated, XML documents are.
String expression = "//nodeIWant";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(modelDoc, XPathConstants.NODESET);
Or, if there is just one of those elements and you'd like to print its string value:
String expression = "//nodeIWant";
System.out.println(xPath.compile(expression).evaluate(modelDoc));
EDIT: You edited your question and revealed the actual XML you are evaluating path expressions against. Those new documents have namespaces that you need to take into account in XPath expressions.
//nodeIWant will never find a node if it is actually in a namespace. To find the Model node in your new documents, you'd have to use
//*[local-name() = 'Model']
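A minimal sketch of the local-name() approach, assuming a namespace-aware parse and trimmed-down versions of the two .emx layouts from the question. The class and method names are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class FindModel {

    // Hypothetical helper: name attribute of the first Model element, or null if none.
    static String modelName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        // local-name() ignores the namespace, so no prefix binding is needed.
        Element model = (Element) xPath.evaluate(
                "//*[local-name() = 'Model']", doc, XPathConstants.NODE);
        return model == null ? null : model.getAttribute("name");
    }

    public static void main(String[] args) throws Exception {
        String uml = "http://www.eclipse.org/uml2/3.0.0/UML";
        String wrapped = "<xmi:XMI xmlns:xmi=\"http://www.omg.org/XMI\" xmlns:uml=\"" + uml + "\">"
                + "<uml:Model name=\"A\"/></xmi:XMI>";
        String bare = "<uml:Model xmlns:uml=\"" + uml + "\" name=\"A\"/>";
        System.out.println(modelName(wrapped)); // prints A
        System.out.println(modelName(bare));    // prints A
    }
}
```

The trade-off is that local-name() also matches a Model element from any other namespace, so binding a prefix is the stricter option when that matters.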
Since the example you provided contains more than just a String in the <nodeIWant> elements you probably can benefit from using an object oriented approach combined with xpath. With data projection (disclosure: I'm affiliated with that project) it's possible to do this:
public class DataProjection {
public interface Book {
@XBRead("./title")
String getTitle();
//... more getter or setter methods
}
public static void main(String[] args) {
// Print all books in all <nodeIWant> elements of each file
for (String file : new String[] { "a.xml", "b.xml", "c.xml" }) {
List<Book> books = new XBProjector().io().file(file).evalXPath("//nodeIWant/book").asListOf(Book.class);
for (Book book : books) {
System.out.println(book.getTitle());
}
}
}
}
You can define one or more views (called projection interfaces) of the XML data and use XPath to connect the data to Java objects implementing these interfaces. This helps a lot in structuring your code and makes it reusable for similar XML files.
I have a response structure that I want to parse in Java. Can anyone help me with this?
<message_response xmlns="">
<action name="GETCIL">
<param name="bookingNote" value="" require="" read-only=""><![CDATA[bookingNote]]></param>
<param name="CarrierLinkType" value="" require="" read-only=""><![CDATA[True]]></param>
<param name="Carrier" value="" require="" read-only=""><![CDATA[SK185]]></param>
<param_list name="ViaAddressList" id="GETCIL">
<value>
<param_list name="ViaAddressId" id="ViaAddressList">
<value><![CDATA[877765050_5511]]></value>
</param_list>
<param_list name="AddressDate" id="ViaAddressList">
<value><![CDATA[10/12/2010]]></value>
</param_list>
<param_list name="AddressTime" id="ViaAddressList">
<value><![CDATA[12:12]]></value>
</param_list>
</value>
</param_list>
</action>
</message_response>
The easiest way to extract specific values from an XML document (as opposed to parsing the complete document with SAX) is to use XPath as follows:
//1. load the document into memory.
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
//2. Create an XPath.
XPath xpath = XPathFactory.newInstance().newXPath();
//3. Evaluate the xpath expression.
String actionName = xpath.evaluate("/message_response/action/@name", documentBuilder.parse(xmlFile));
There's not much more to it, other than that the XPath.evaluate method is overloaded to allow nodes and node lists to be returned (see javax.xml.xpath.XPathConstants for the types).
Then you just need to read up on the XPath syntax (http://www.w3schools.com/xpath/xpath_syntax.asp).
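As a rough illustration of the node-list overload, here is a sketch that collects every param of the action into a map of name to text. It assumes the response layout from the question; the class and method names are made up for illustration:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParamReader {

    // Hypothetical helper: maps each param's name attribute to its text content.
    static Map<String, String> params(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPathConstants.NODESET makes evaluate return a NodeList instead of a String.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/message_response/action/param", doc, XPathConstants.NODESET);

        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element param = (Element) nodes.item(i);
            // getTextContent() also returns text wrapped in CDATA sections.
            result.put(param.getAttribute("name"), param.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<message_response xmlns=\"\"><action name=\"GETCIL\">"
                + "<param name=\"bookingNote\"><![CDATA[bookingNote]]></param>"
                + "<param name=\"CarrierLinkType\"><![CDATA[True]]></param>"
                + "<param name=\"Carrier\"><![CDATA[SK185]]></param>"
                + "</action></message_response>";
        System.out.println(params(xml));
        // prints {bookingNote=bookingNote, CarrierLinkType=True, Carrier=SK185}
    }
}
```

The nested param_list values can be read the same way with a path such as //param_list[@name='ViaAddressId']/value.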
Why the CDATA sections around the data?
You can use SAX or DOM to parse XML.
There are also libraries wrapping SAX and DOM parsers that make your life easier for common tasks. Two that come to mind for Java are JDOM and DOM4J. Google for them - there are tutorials and examples available that will show you what you need to know.
I've read XPath - how to select text and thought I had the general idea. But, as always, XPath rears up, hisses at me, and scuttles off to find the nearest bacteria-infested urinal to drown in.
I have a JPA orm.xml file. It looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<entity-mappings xmlns="http://java.sun.com/xml/ns/persistence/orm"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence/orm
http://java.sun.com/xml/ns/persistence/orm_2_0.xsd"
version="2.0">
<persistence-unit-metadata>
<persistence-unit-defaults>
<schema>test</schema>
<catalog>test</catalog>
</persistence-unit-defaults>
</persistence-unit-metadata>
</entity-mappings>
The following XPath expression should, I would think, select the text from the <schema> element:
/entity-mappings/persistence-unit-metadata/persistence-unit-defaults/schema/text()
But using Java's XPath implementation, it does not.
More specifically, the following code fails (using JUnit asserts) on the last line. The value of the text variable is the empty string.
// Find the file: URL to the orm.xml I mentioned above.
final URL ormUrl = Thread.currentThread().getContextClassLoader().getResource("META-INF/orm.xml");
assertNotNull(ormUrl);
final XPathFactory xpf = XPathFactory.newInstance();
assertNotNull(xpf);
final XPath xpath = xpf.newXPath();
assertNotNull(xpath);
final XPathExpression expression = xpath.compile("/entity-mappings/persistence-unit-metadata/persistence-unit-defaults/schema/text()");
assertNotNull(expression);
final String text = expression.evaluate(new InputSource(ormUrl.openStream()));
assertEquals("test", text);
This seems to cast into doubt what little understanding I had of XPath expressions to begin with. Flailing around, I then wanted to see if a simple "/" would select the root element. Mercifully, this returned a non-null NodeList, but the NodeList was empty. I really don't want to hunt the authors of the Java XPath support down and string them up, but it's getting awfully difficult not to follow that course of action.
Please help me shoot XPath in the head once and for all. Thanks.
The problem is that the XML declares a default namespace
xmlns="http://java.sun.com/xml/ns/persistence/orm"
while in your XPath expression you have not provided a corresponding namespace context. See this link for details on how to work with namespace contexts. There's a lot of detail there, but in summary you have to write your own implementation of javax.xml.namespace.NamespaceContext that allows the XPath processor to map namespace prefixes to URIs. XPath 1.0 always treats unprefixed names as being in no namespace, so in your case you must bind a prefix of your choosing to the document's default namespace URI and use that prefix in every step of the expression.
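A minimal sketch of such a NamespaceContext, backed by a plain map so it can be reused for other documents. The prefix orm and all class and method names are assumptions made for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class OrmSchemaReader {

    /** Map-backed prefix lookup; a sketch, not a complete NamespaceContext. */
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> uriByPrefix;

        MapNamespaceContext(Map<String, String> uriByPrefix) {
            this.uriByPrefix = new HashMap<>(uriByPrefix);
        }

        @Override
        public String getNamespaceURI(String prefix) {
            String uri = uriByPrefix.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            Iterator<String> prefixes = getPrefixes(namespaceURI);
            return prefixes.hasNext() ? prefixes.next() : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            List<String> prefixes = new ArrayList<>();
            for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
                if (entry.getValue().equals(namespaceURI)) {
                    prefixes.add(entry.getKey());
                }
            }
            return prefixes.iterator();
        }
    }

    // Hypothetical helper: text of the <schema> element in an orm.xml document.
    static String schemaName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new MapNamespaceContext(Collections.singletonMap(
                "orm", "http://java.sun.com/xml/ns/persistence/orm")));
        return xpath.evaluate("/orm:entity-mappings/orm:persistence-unit-metadata"
                + "/orm:persistence-unit-defaults/orm:schema", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<entity-mappings xmlns=\"http://java.sun.com/xml/ns/persistence/orm\""
                + " version=\"2.0\"><persistence-unit-metadata><persistence-unit-defaults>"
                + "<schema>test</schema><catalog>test</catalog>"
                + "</persistence-unit-defaults></persistence-unit-metadata></entity-mappings>";
        System.out.println(schemaName(xml)); // prints test
    }
}
```

Evaluating to a string makes the trailing text() step unnecessary, because the string value of the schema element is already its text.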