I want to make an XPath query on this XML file (excerpt shown):
<?xml version="1.0" encoding="UTF-8"?>
<!-- MetaDataAPI generated on: Friday, May 25, 2007 3:26:31 PM CEST -->
<Component xmlns="http://xml.sap.com/2002/10/metamodel/webdynpro" xmlns:IDX="urn:sap.com:WebDynpro.Component:2.0" mmRelease="6.30" mmVersion="2.0" mmTimestamp="1180099591892" name="MassimaleContr" package="com.bi.massimalecontr" masterLanguage="it">
...
<Component.UsedModels>
<Core.Reference package="com.test.test" name="MasterModel" type="Model"/>
<Core.Reference package="com.test.massimalecontr" name="MassimaleModel" type="Model"/>
<Core.Reference package="com.test.test" name="TravelModel" type="Model"/>
</Component.UsedModels>
...
I'm using this snippet of code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document document = builder.parse(new File("E:\\Test branch\\test.wdcomponent"));
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
...(omitted)
System.out.println(xpath.evaluate(
"//d:Component/d:Component.UsedModels/d:Core.Reference/@name",
document));
What I'm expecting to get:
MasterModel
MassimaleModel
TravelModel
What I'm getting:
MasterModel
It seems that only the first element is returned. How can I get all the occurrences that match my query?
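The two-argument evaluate(expression, source) returns only the string value of the first matching node. Ask for a NODESET instead and you'll get a NodeList with every match:
XPathExpression expr = xpath.compile("//d:Core.Reference/@name");
NodeList list = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < list.getLength(); i++) {
    Node node = list.item(i);
    // for an attribute node this is the attribute's value
    System.out.println(node.getTextContent());
    // work with node
}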
See How to read XML using XPath in Java
As in that example, if you compile the XPath expression and then evaluate it, specifying that you want a NODESET back, you should get the result you want.
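To tie this back to the question, here is a minimal, self-contained sketch that collects every name attribute. The class name UsedModelNames, the method modelNames and the prefix d are assumptions made for illustration; the namespace URI is the one from the xmlns declaration in the question, and the NamespaceContext is only a guess at what the omitted part of the question's code does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class UsedModelNames {
    // Namespace URI copied from the xmlns declaration in the question.
    static final String WD_NS = "http://xml.sap.com/2002/10/metamodel/webdynpro";

    /** Returns the name attribute of every Core.Reference under Component.UsedModels. */
    static List<String> modelNames(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the (arbitrary) prefix "d" to the document's default namespace.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "d".equals(prefix) ? WD_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return WD_NS.equals(uri) ? "d" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return WD_NS.equals(uri)
                            ? Collections.singletonList("d").iterator()
                            : Collections.<String>emptyIterator();
                }
            });

            // NODESET returns all matches, not just the first one.
            NodeList attrs = (NodeList) xpath.evaluate(
                    "//d:Component/d:Component.UsedModels/d:Core.Reference/@name",
                    doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                names.add(attrs.item(i).getNodeValue()); // attribute value
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

Calling UsedModelNames.modelNames(xml) with the file's content as a string should then give one entry per Core.Reference element, in document order.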
I am receiving XML as a string. Is there a library for searching for elements in that string?
<Version value="0"/>
<IssueDate>2017-12-15</IssueDate>
<Locale>en_US</Locale>
<RecipientAddress>
<Category>Primary</Category>
<SubCategory>0</SubCategory>
<Name>Vitsi</Name>
<Attention>Stowell Group Llc.</Attention>
<AddressLine1>511 6th St</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-2903</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>2903</ZIP4>
</RecipientAddress>
<RecipientAddress>
<Category>Additional</Category>
<SubCategory>1</SubCategory>
<Name>Vitsi</Name>
<AddressLine1>Po Box 957</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-0104</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>0104</ZIP4>
</RecipientAddress>
<SenderName>TMO</SenderName>
<SenderId>IL</SenderId>
<SenderAddress>
<Name>T-mobile</Name>
<AddressLine1>Po Box 790047</AddressLine1>
<City>St. Louis</City>
<PresentationValue>ST. LOUIS MO 63179-0047</PresentationValue>
<State>MO</State>
<ZIPCode>63179</ZIPCode>
.
.
.
.
I want to access the RecipientAddress elements, of which there can be several. Is there any library to do that? Please note that what I receive is a string. It is an invoice and there will be many to process, so performance is important.
The following options are available:
Convert the XML string to Java objects using JAXB.
Use String.indexOf() to locate specific parts of the XML.
Use regular expressions to extract specific parts of the XML.
Use a SAX, DOM or StAX parser to parse the XML and extract values.
Use XPath to fetch specific values from the XML.
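Since performance was mentioned, here is a rough sketch of the StAX option, which reads the string as a stream instead of building a DOM tree. The class name RecipientAddressStream and the method read are hypothetical, and the sketch assumes every child of RecipientAddress contains plain text only, as in the excerpt. Whether it is actually faster for your invoices is something to measure.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class illustrating the StAX option.
class RecipientAddressStream {

    /** Returns one field-name -> text map per RecipientAddress element. */
    static List<Map<String, String>> read(String xml) {
        List<Map<String, String>> addresses = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Map<String, String> current = null; // non-null while inside an address
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("RecipientAddress".equals(name)) {
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        // Assumption: child elements hold text only.
                        // getElementText() throws if it meets a nested element.
                        current.put(name, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "RecipientAddress".equals(reader.getLocalName())) {
                    addresses.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return addresses;
    }
}
```

Each map preserves the order of the fields, so read(xml).get(0).get("Category") would return the category of the first address.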
You could use XPath. Java has built-in support for querying XML, without any third-party library.
The code would look like this:
String xmlInputStr = "<YOUR_XML_STRING_INPUT>";
String xpathExpressionStr = "<XPATH_EXPRESSION_STRING>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
// parse(String) expects a URI, so wrap the XML text in an InputSource
// (needs org.xml.sax.InputSource and java.io.StringReader)
Document doc = builder.parse(new InputSource(new StringReader(xmlInputStr)));
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(xpathExpressionStr);
You can write your own expression string for querying. A typical example, which matches RecipientAddress wherever it appears in the document:
"//RecipientAddress/Category"
Evaluate the expression against your XML to retrieve the list of matching nodes.
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
And iterate over nodes,
for (int i = 0; i < nodes.getLength(); i++) {
Node nNode = nodes.item(i);
...
}
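Putting the pieces above together, a minimal sketch might look like the following. The class name RecipientAddressXPath and the method summaries are made up for illustration. The expressions are compiled once so that they can be reused for each address, and the second and third expressions are relative, evaluated with each RecipientAddress element as the context node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not from the original answer.
class RecipientAddressXPath {

    /** Returns "Category: PresentationValue" for every RecipientAddress. */
    static List<String> summaries(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression addresses = xpath.compile("//RecipientAddress");
            // Relative paths: resolved against each address element below.
            XPathExpression category = xpath.compile("Category");
            XPathExpression presentation = xpath.compile("PresentationValue");

            NodeList nodes = (NodeList) addresses.evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // evaluate(Object) returns the string value of the first match.
                result.add(category.evaluate(nodes.item(i)) + ": "
                        + presentation.evaluate(nodes.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

If a field is missing from an address, the corresponding part of the string is simply empty.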
Many ready-made APIs are available for converting XML to Java objects.
Have a look at Xerces from Apache.
If you only want to extract one specific value, put the whole document into a String and use indexOf("string").
I have an XML document:
<response>
<result>
<phone>1233</phone>
<sys_id>asweyu4</sys_id>
<link>rft45fgd</link>
<!-- Many more in result -->
</result>
<!-- Many more result nodes -->
</response>
The XML structure is unknown. The XPath expressions for the values come from the user,
e.g. the inputs are strings like:
//response/result/sys_id , //response/result/phone
How can I get these node values for the whole XML document by evaluating the XPath?
I referred to this, but my XPath is as shown above, i.e. it does not use * or text().
An XPath evaluator works perfectly fine with my input format, so is there any way I can achieve the same in Java?
Thank you!
It's difficult without seeing your code... I'd just evaluate as a NodeList and then call getTextContent() on each node in the result list...
String input = "<response><result><phone>1233</phone><sys_id>asweyu4</sys_id><link>rft45fgd</link></result><result><phone>1233</phone><sys_id>another-sysid</sys_id><link>another-link</link></result></response>";
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
.parse(new ByteArrayInputStream(input.getBytes("UTF-8")));
XPath path = XPathFactory.newInstance().newXPath();
NodeList node = (NodeList) path.compile("//response/result/sys_id").evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < node.getLength(); i++) {
System.out.println(node.item(i).getTextContent());
}
Output
asweyu4
another-sysid
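Because the question mentions several user-supplied expressions, the same idea can be wrapped in a small helper that evaluates each one in turn. This is only a sketch: UserXPaths and evaluateAll are hypothetical names, and it assumes every expression selects nodes. An expression such as count(//result) does not return a node-set and would make the NODESET evaluation fail.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not from the original answer.
class UserXPaths {

    /** Maps each expression to the text content of all nodes it selects. */
    static Map<String, List<String>> evaluateAll(String xml, List<String> expressions) {
        try {
            // Parse once, then run every expression against the same tree.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            Map<String, List<String>> results = new LinkedHashMap<>();
            for (String expression : expressions) {
                NodeList nodes = (NodeList) xpath.evaluate(
                        expression.trim(), doc, XPathConstants.NODESET);
                List<String> values = new ArrayList<>();
                for (int i = 0; i < nodes.getLength(); i++) {
                    values.add(nodes.item(i).getTextContent());
                }
                results.put(expression, values);
            }
            return results;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

The map keeps the expressions in the order they were supplied, so the results can be reported back to the user in the same order.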
I parse an XML document in Java with:
doc = DocumentBuilderFactory
.newInstance()
.newDocumentBuilder()
.parse(new URL(url).openStream());
This works, but is it possible to parse with some kind of filter? For example, my XML file has a priority element; is it possible to parse with a filter such as priority > 8?
The doc would then contain only the elements with priority > 8.
Example XML:
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>1.0</priority>
</url>
...
Thanks
For the following sample input file named urls.xml
<root>
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>1.0</priority>
</url>
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>7.0</priority>
</url>
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>10.0</priority>
</url>
</root>
You first create the full Document tree as usual
Document document = DocumentBuilderFactory
.newInstance()
.newDocumentBuilder()
.parse(new File("urls.xml"));
Then run the XPath query that selects all the Nodes above a certain priority
XPathExpression expr = XPathFactory.newInstance()
.newXPath().compile("//url[priority > 5]");
NodeList urls = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
If you want to serialize the results to another xml file, create a new Document first.
Document result = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().newDocument();
Node root = result.createElement("results");
result.appendChild(root);
Then append the filtered url Nodes as
for (int i = 0; i < urls.getLength(); i++) {
Node copy = result.importNode(urls.item(i), true);
root.appendChild(result.createTextNode("\n\t"));
root.appendChild(copy);
}
root.appendChild(result.createTextNode("\n"));
Now, all you need to do is to serialize the new Document to a String and write that out to a file. Here I'm just printing it out to the console.
System.out.println(
((DOMImplementationLS) result.getImplementation())
.createLSSerializer().writeToString(result));
Output:
<?xml version="1.0" encoding="UTF-16"?>
<results>
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>7.0</priority>
</url>
<url>
<loc>http</loc>
<lastmod>2015-02-26</lastmod>
<title>Hello</title>
<priority>10.0</priority>
</url>
</results>
You should use XPath to find the elements you require:
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
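XPathExpression expr = xpath.compile("<your XPath here>");
Then...
// without XPathConstants.NODESET, evaluate() returns a String and the cast fails
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);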
... to get the nodes you require. You can use...
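// NodeList is not Iterable, so loop by index
for (int i = 0; i < nl.getLength(); i++) {
    Node node = nl.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        // work with the element
    }
}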
... to pull out only the genuine elements.
Of course, you'll need to also build up a basic XPath expression to find the nodes you require.
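For the priority filter from the question, one way to build that expression without concatenating the threshold into the string is an XPath variable. The sketch below assumes the url/priority layout shown in the question; PriorityFilter, prioritiesAbove and the variable name min are made-up names. The resolver is set before the expression is evaluated so that $min can be looked up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not from the original answers.
class PriorityFilter {

    /** Returns the priority text of every url element whose priority exceeds min. */
    static List<String> prioritiesAbove(String xml, double min) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Supply the value of $min from Java code.
            xpath.setXPathVariableResolver(
                    (QName name) -> "min".equals(name.getLocalPart())
                            ? Double.valueOf(min) : null);

            NodeList urls = (NodeList) xpath.evaluate(
                    "//url[priority > $min]", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < urls.getLength(); i++) {
                Element url = (Element) urls.item(i);
                result.add(url.getElementsByTagName("priority").item(0).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

Note that this still parses the whole document first; the filter is applied to the finished tree, not during parsing.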
I have an XML file which looks like:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<alarm-response-list xmlns="http://www.ca.com/spectrum/restful/schema/response"
error="EndOfResults" throttle="277" total-alarms="288">
<alarm-responses>
<alarm id="53689bf8-6cc8-1003-0060-008010186429">
<attribute id="0x11f4a" error="NoSuchAttribute" />
<attribute id="0x12b4c">UPS DIAGNOSTIC TEST FAILED</attribute>
<attribute id="0x10b5a">IDG860237, SL3-PL4, US, SapNr=70195637,</attribute>
</alarm>
<alarm id="536b8c9a-28b3-1008-0060-008010186429">
<attribute id="0x11f4a" error="NoSuchAttribute" />
<attribute id="0x12b4c">DEVICE IN MAINTENANCE MODE</attribute>
<attribute id="0x10b5a">IDG860237, SL3-PL4, US, SapNr=70195637,</attribute>
</alarm>
</alarm-responses>
</alarm-response-list>
There are a lot of these alarms. For every alarm tag, I want to save the attribute element with id = 0x10b5a in a String, but I don't know how. My attempt doesn't work: it only prints the expression.
My idea:
FileInputStream file = new FileInputStream(
new File(
"alarms.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
System.out.println("*************************");
String expression = "/alarm-responses/alarm/attribute[@id='0x10b5a'] ";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(
xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getFirstChild()
.getNodeValue());
}
Several different problems are interacting here, with the result that your XPath expression doesn't match anything. Firstly, the alarm-responses element isn't the root of the document: you need an extra step at the front of the path to select the alarm-response-list element. But more importantly, you have namespace issues.
XPath only works reliably when the XML has been parsed with namespace awareness enabled, which for some reason is not the default for DocumentBuilderFactory. You need to call setNamespaceAware(true) before you call newDocumentBuilder().
Your XML document declares xmlns="http://www.ca.com/spectrum/restful/schema/response", which puts all the elements in this namespace, but unprefixed names in an XPath expression always refer to nodes that are not in any namespace. To match namespaced nodes you need to bind a prefix to the namespace URI and then use prefixed names in the path.
For javax.xml.xpath this is done using a NamespaceContext, but annoyingly the Java core library does not ship a ready-made implementation of this interface. There is a SimpleNamespaceContext implementation available as part of Spring, or it's fairly simple to write your own. Using the Spring class:
DocumentBuilderFactory builderFactory = DocumentBuilderFactory
.newInstance();
// enable namespaces
builderFactory.setNamespaceAware(true);
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
// Set up the namespace context
SimpleNamespaceContext ctx = new SimpleNamespaceContext();
ctx.bindNamespaceUri("ca", "http://www.ca.com/spectrum/restful/schema/response");
xPath.setNamespaceContext(ctx);
System.out.println("*************************");
// corrected expression
String expression = "/ca:alarm-response-list/ca:alarm-responses/ca:alarm/ca:attribute[@id='0x10b5a']";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(
xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getTextContent());
}
Note also how I'm using getTextContent() to get the text under each matched element. The getNodeValue() method always returns null for element nodes.
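If you would rather not pull in Spring, a hand-written NamespaceContext can be quite small. The following is a sketch only: MapNamespaceContext, AlarmAttributes and their methods are hypothetical names, and the prefix ca is just as arbitrary as in the code above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical map-backed NamespaceContext (a stand-in for Spring's class).
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new HashMap<>();

    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override public String getPrefix(String uri) {
        Iterator<String> prefixes = getPrefixes(uri);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override public Iterator<String> getPrefixes(String uri) {
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(uri)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }
}

// Hypothetical helper using the context above on the alarm document.
class AlarmAttributes {
    /** Returns the text of each attribute element with id 0x10b5a. */
    static List<String> selectedAttributes(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // must happen before newDocumentBuilder()
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new MapNamespaceContext()
                    .bind("ca", "http://www.ca.com/spectrum/restful/schema/response"));

            NodeList nodes = (NodeList) xpath.evaluate(
                    "/ca:alarm-response-list/ca:alarm-responses/ca:alarm"
                            + "/ca:attribute[@id='0x10b5a']",
                    doc, XPathConstants.NODESET);

            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

The XPath implementation only needs getNamespaceURI to resolve prefixes in the expression; the other two methods are there to satisfy the interface.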
I'm trying to read an XML file, for example: http://www1.skysports.com/feeds/11095/news.xml
I need to be able to call getTextContent() for all the titles, descriptions etc. that are children of <item> tags. There is a <title> tag that is not a child of an <item> tag, and I don't want to call getTextContent() for that one.
I've set up my XML reader so that I have:
Document doc = dbuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
String title = doc.getElementsByTagName("title").item(0).getTextContent();
but this picks up the <title> that isn't a child of <item>.
I could just change item(0) to item(1), but I need this algorithm to work with various XML files that won't necessarily have the initial <title> without an <item> parent.
How can I select just those <title>s that are children of <item>s?
Use XPath instead. Makes it all a lot easier:
XPathFactory xpf = XPathFactory.newInstance();
XPath xp = xpf.newXPath();
NodeList nl = (NodeList) xp.evaluate("//item/title/text()", doc,
XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); ++i) {
System.out.println(nl.item(i).getNodeValue());
}
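If you would rather stay with getElementsByTagName, an alternative is to check each node's parent and keep only the titles that sit directly under an item. This is a sketch under the assumption that the feed uses plain, unprefixed item and title elements; ItemTitles and itemTitles are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class showing a DOM-only alternative to XPath.
class ItemTitles {

    /** Returns the text of every title element whose parent is an item element. */
    static List<String> itemTitles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // All title elements, at any depth, in document order.
            NodeList titles = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < titles.getLength(); i++) {
                Node title = titles.item(i);
                Node parent = title.getParentNode();
                // Skip titles that belong to the channel or anything else.
                if (parent != null && "item".equals(parent.getNodeName())) {
                    result.add(title.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

This works whether or not the file has a top-level title, because the decision is made per node and not by position in the list.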