XPath select element with an attribute - java

I have an XML file that looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<alarm-response-list xmlns="http://www.ca.com/spectrum/restful/schema/response"
error="EndOfResults" throttle="277" total-alarms="288">
<alarm-responses>
<alarm id="53689bf8-6cc8-1003-0060-008010186429">
<attribute id="0x11f4a" error="NoSuchAttribute" />
<attribute id="0x12b4c">UPS DIAGNOSTIC TEST FAILED</attribute>
<attribute id="0x10b5a">IDG860237, SL3-PL4, US, SapNr=70195637,</attribute>
</alarm>
<alarm id="536b8c9a-28b3-1008-0060-008010186429">
<attribute id="0x11f4a" error="NoSuchAttribute" />
<attribute id="0x12b4c">DEVICE IN MAINTENANCE MODE</attribute>
<attribute id="0x10b5a">IDG860237, SL3-PL4, US, SapNr=70195637,</attribute>
</alarm>
</alarm-responses>
</alarm-response-list>
There are a lot of these alarms. For every alarm element, I want to store the text of the attribute element with id = 0x10b5a in a String, but I don't know how. My attempt doesn't work: it only prints the expression.
My idea:
FileInputStream file = new FileInputStream(
new File(
"alarms.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
System.out.println("*************************");
String expression = "/alarm-responses/alarm/attribute[@id='0x10b5a'] ";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(
xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getFirstChild()
.getNodeValue());
}

Several problems combine here so that your XPath expression doesn't match anything. First, the alarm-responses element isn't the root of the document: you need an extra step at the front of the path to select the alarm-response-list element. More importantly, you have namespace issues.
XPath only works when the XML has been parsed with namespaces enabled, which for some reason is not the default for DocumentBuilderFactory. You need to enable namespaces before you call newDocumentBuilder().
Now your XML document has xmlns="http://www.ca.com/spectrum/restful/schema/response", which puts all the elements in this namespace, but unprefixed node names in an XPath expression always refer to nodes that are not in a namespace. In order to match namespaced nodes you need to bind a prefix to the namespace URI and then use prefixed names in the path.
For javax.xml.xpath this is done using a NamespaceContext, but annoyingly the Java core library provides no ready-made implementation of this interface. There is a SimpleNamespaceContext implementation available as part of Spring, or it's fairly simple to write your own. Using the Spring class:
DocumentBuilderFactory builderFactory = DocumentBuilderFactory
.newInstance();
// enable namespaces
builderFactory.setNamespaceAware(true);
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
// Set up the namespace context
SimpleNamespaceContext ctx = new SimpleNamespaceContext();
ctx.bindNamespaceUri("ca", "http://www.ca.com/spectrum/restful/schema/response");
xPath.setNamespaceContext(ctx);
System.out.println("*************************");
// corrected expression
String expression = "/ca:alarm-response-list/ca:alarm-responses/ca:alarm/ca:attribute[@id='0x10b5a']";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(
xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getTextContent());
}
Note also how I'm using getTextContent() to get the text under each matched element. The getNodeValue() method always returns null for element nodes.
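If you would rather not depend on Spring, the "write your own" option might look roughly like the sketch below. The class and method names (AlarmAttributeReader, CaNamespaceContext, readAttributes) are made up for illustration, and the XML is passed in as a String rather than read from a file; everything else uses only the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
final class AlarmAttributeReader {

    static final String CA_NS = "http://www.ca.com/spectrum/restful/schema/response";

    /** Hand-written NamespaceContext that binds only the prefix "ca". */
    static final class CaNamespaceContext implements NamespaceContext {
        @Override
        public String getNamespaceURI(String prefix) {
            return "ca".equals(prefix) ? CA_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return CA_NS.equals(namespaceURI) ? "ca" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    }

    /** Returns the text of every attribute element with id 0x10b5a, one entry per match. */
    static List<String> readAttributes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // must be set before newDocumentBuilder()
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new CaNamespaceContext());
            NodeList nodes = (NodeList) xPath.evaluate(
                    "/ca:alarm-response-list/ca:alarm-responses/ca:alarm/ca:attribute[@id='0x10b5a']",
                    doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

Calling AlarmAttributeReader.readAttributes(xml) on a document shaped like the one in the question should give one string per alarm. The prefix "ca" is arbitrary; only the URI it is bound to has to match the document.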

Related

Getting null values from XPath query

I have this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<iet:aw-data xmlns:iet="http://care.aw.com/IET/2007/12" class="com.aw.care.bean.resource.MessageResource">
<iet:metadata filter=""/>
<iet:message-resource>
<iet:message>some message 1</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.11</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
<iet:message-resource>
<iet:message>some message 2</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.12</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
.
.
.
.
</iet:aw-data>
Using the code below, I iterate over the data to find what I need.
try {
FileInputStream fileIS = new FileInputStream(new File("resources\\bootstrap\\content\\MessageResources_iw_IL\\MessageResource_iw_IL.ctdata.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String query = "//*[local-name()='message-resource']//*[local-name()='code'][contains(text(), 'account')]";
NodeList nodeList = (NodeList) xPath.compile(query).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("size= " + nodeList.getLength());
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue());
}
}
catch (Exception e){
e.printStackTrace();
}
The issue is that I only get null values when printing in the for loop. Any idea why this happens?
The code needs to return a list of nodes whose code and message fields both contain given parameters (like an SQL query with two conditions joined by AND).
Check the documentation:
https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html
getNodeValue() applied to an element node returns null.
Use getTextContent().
Alternatively, if you find DOM too frustrating, switch to one of the better tree models like JDOM2 or XOM.
Also, if you used an XPath 2.0 engine like Saxon, it would (a) simplify your expression to
//*:message-resource//*:code[contains(text(), 'account')]
and (b) allow you to return a sequence of strings from the XPath expression, rather than a sequence of nodes, so you wouldn't have to mess around with nodelists.
Another point: I suspect that the predicate [contains(text(), 'account')] should really be [.='account']. I'm not sure of that, but using text() instead of ".", and using contains() instead of "=", are both common mistakes.
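To make the getNodeValue() point concrete, here is a small sketch that compares the different ways of reading an element's text. The class and method names (NodeValueDemo, inspectFirstCode) are invented for the example, and the query is a simplified variant of the one in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
final class NodeValueDemo {

    /**
     * Looks up the first code element and returns, in order:
     * element.getNodeValue(), element.getTextContent(),
     * the value of its first (text) child, and the XPath string value.
     */
    static String[] inspectFirstCode(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            String query = "//*[local-name()='message-resource']/*[local-name()='code']";
            Node code = (Node) xPath.evaluate(query, doc, XPathConstants.NODE);

            return new String[] {
                code.getNodeValue(),                 // null: element nodes have no node value
                code.getTextContent(),               // concatenated text of the element
                code.getFirstChild().getNodeValue(), // the text node itself does have a value
                xPath.evaluate(query, doc)           // string value of the first matched node
            };
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

Only the first entry is null. The other three all carry the element's text, so any of them can replace getNodeValue() in the loop from the question.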

Get element value from XML with XPath

I have XML file like this:
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents- xmlns:xades="http://uri.etsi.org/01903/v1.3.2#" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 UBL-Invoice-2.1.xsd">
<cac:AccountingSupplierParty>
<cac:Party>
<cac:PartyIdentification>
<cbc:ID schemeID="schema1">123231123</cbc:ID>
</cac:PartyIdentification>
<cac:PartyIdentification>
<cbc:ID schemeID="schema2">2323232323</cbc:ID>
</cac:PartyIdentification>
<cac:PartyIdentification>
<cbc:ID schemeID="schema3">4442424</cbc:ID>
</cac:PartyIdentification>
<cac:PostalAddress>
<cbc:CityName>İstanbul</cbc:CityName>
<cac:Country>
<cbc:Name>Turkey</cbc:Name>
</cac:Country>
</cac:PostalAddress>
</cac:Party>
</cac:AccountingSupplierParty>
</Invoice>
I want to read the value of the cbc:ID element with schemeID="schema2". I tried both XPath and document.getElementsByTagName. getElementsByTagName finds the elements, but since there are several of them I can't pick out the one I want. With XPath, I can't select any element at all.
Here is my XPath implementation:
try {
String decoded = new
String(DatatypeConverter.parseBase64Binary(binaryXmlData));
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(decoded));
Document doc = db.parse(is);
String expression = "/Invoice/cac:AccountingSupplierParty/cac:Party/cac:PartyIdentification/cbc:ID#[schemaID='schema2']/text()";
String schema2 = (String) xPath.compile(expression).evaluate(doc, XPathConstants.STRING);
System.out.println(schema2);
//schema2 is null
//Above this code block returns correct value
NodeList nl = doc.getElementsByTagName("cbc:CityName");
System.out.println(nl.item(0).getTextContent());
} catch () {
}
binaryXmlData is the source of my XML. First, I convert the base64Binary data to XML. Is my conversion wrong, or is my XPath implementation wrong?
There are many problems with your code and your XML, including:
Your XML is not well-formed. The closing quote of the cbc namespace declaration is missing.
Your Java code never defines a NamespaceContext.
See also How does XPath deal with XML namespaces?
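Once the XML is well-formed, a NamespaceContext-based lookup could be sketched as follows. This assumes the cbc namespace is the usual UBL 2 URI ending in CommonBasicComponents-2 (the declaration in the question is cut off), and it assumes the attribute is spelled schemeID as in the XML, not schemaID as in the expression. InvoiceIdLookup, MapNamespaceContext, and the prefix "inv" are names made up for this example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
final class InvoiceIdLookup {

    /** Simplified NamespaceContext backed by a prefix-to-URI map. */
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> uriByPrefix;

        MapNamespaceContext(Map<String, String> uriByPrefix) {
            this.uriByPrefix = new HashMap<>(uriByPrefix);
        }

        @Override
        public String getNamespaceURI(String prefix) {
            String uri = uriByPrefix.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            for (Map.Entry<String, String> e : uriByPrefix.entrySet()) {
                if (e.getValue().equals(namespaceURI)) {
                    return e.getKey();
                }
            }
            return null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            // Simplification: reports at most one prefix per URI.
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    }

    /** Returns the text of the cbc:ID element whose schemeID is "schema2". */
    static String findSchema2Id(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Map<String, String> ns = new HashMap<>();
            // The default namespace of the document needs a prefix in XPath 1.0.
            ns.put("inv", "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2");
            ns.put("cac", "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2");
            // Assumed URI: the declaration in the question is truncated.
            ns.put("cbc", "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2");

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MapNamespaceContext(ns));
            return xPath.evaluate(
                    "/inv:Invoice/cac:AccountingSupplierParty/cac:Party"
                            + "/cac:PartyIdentification/cbc:ID[@schemeID='schema2']",
                    doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

Note that the root step is inv:Invoice, not Invoice: the root element sits in the default namespace, so an unprefixed name would not match it.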

Access data in xml as string

I am receiving XML as a string. Is there any library to search for elements in the string?
<Version value="0"/>
<IssueDate>2017-12-15</IssueDate>
<Locale>en_US</Locale>
<RecipientAddress>
<Category>Primary</Category>
<SubCategory>0</SubCategory>
<Name>Vitsi</Name>
<Attention>Stowell Group Llc.</Attention>
<AddressLine1>511 6th St</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-2903</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>2903</ZIP4>
</RecipientAddress>
<RecipientAddress>
<Category>Additional</Category>
<SubCategory>1</SubCategory>
<Name>Vitsi</Name>
<AddressLine1>Po Box 957</AddressLine1>
<City>Lake Oswego</City>
<Country>United States</Country>
<PresentationValue>Lake Oswego OR 97034-0104</PresentationValue>
<State>OR</State>
<ZIPCode>97034</ZIPCode>
<ZIP4>0104</ZIP4>
</RecipientAddress>
<SenderName>TMO</SenderName>
<SenderId>IL</SenderId>
<SenderAddress>
<Name>T-mobile</Name>
<AddressLine1>Po Box 790047</AddressLine1>
<City>St. Louis</City>
<PresentationValue>ST. LOUIS MO 63179-0047</PresentationValue>
<State>MO</State>
<ZIPCode>63179</ZIPCode>
.
.
.
.
I want to access the RecipientAddress elements, of which there are several. Is there any library to do that? Please note that what I receive is a string. It is an invoice and there will be many to process, so performance is important.
The following options are available:
Convert the XML string to Java objects using JAXB.
Use String.indexOf() to retrieve specific parts of the XML.
Use a regular expression to retrieve specific parts of the XML.
Use a SAX/DOM/StAX parser for parsing and extraction.
Use XPath to fetch specific values from the XML.
You could use XPath. Java has built-in support for XML querying without any third-party library.
The code would be:
String xmlInputStr = "<YOUR_XML_STRING_INPUT>";
String xpathExpressionStr = "<XPATH_EXPRESSION_STRING>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
// parse(String) expects a URI, so wrap the XML text in an InputSource
Document doc = builder.parse(new InputSource(new StringReader(xmlInputStr)));
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(xpathExpressionStr);
You can write your own expression string for querying. Typical example
"/RecipientAddress/Category"
Evaluate your xml against expression to retrieve list of nodes.
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
And iterate over nodes,
for (int i = 0; i < nodes.getLength(); i++) {
Node nNode = nodes.item(i);
...
}
Many ready-made APIs are available for converting XML to Java objects.
Take a look at Xerces from Apache.
If you only want to extract a specific value, keep the whole document in a string and use indexOf("string").
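Putting the XPath pieces from the first answer together, a sketch for reading every RecipientAddress from an XML string might look like this. It assumes the fragment shown in the question sits inside a single root element (the question does not show one), and the names RecipientAddressReader and summarize are invented for illustration. The expressions are compiled once and reused for every address.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
final class RecipientAddressReader {

    /** Returns "Category: PresentationValue" for each RecipientAddress element. */
    static List<String> summarize(String xml) {
        try {
            // Parse from the string itself; parse(String) would treat it as a URI.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            XPathExpression addresses = xPath.compile("//RecipientAddress");
            // Relative expressions, evaluated against each address node below.
            XPathExpression category = xPath.compile("Category");
            XPathExpression presentation = xPath.compile("PresentationValue");

            NodeList nodes = (NodeList) addresses.evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node address = nodes.item(i);
                result.add(category.evaluate(address) + ": " + presentation.evaluate(address));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read recipient addresses", e);
        }
    }
}
```

If throughput across many invoices turns out to matter, a streaming parser (SAX or StAX) avoids building a DOM per document, but that is a separate trade-off worth measuring before committing to it.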

Why getting null node value while parsing XML

While parsing the XML below, I first got a MalformedURLException, so instead of passing the XML string directly I used this code
Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));
following this link:
java.net.MalformedURLException: no protocol
Now the node value comes back as null. How can I fix this? In the for loop in the code I have marked where the null value appears.
I am using the following code:
try {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));
//read this - https://stackoverflow.com/questions/13786607/normalization-in-dom-parsing-with-java-how-does-it-work
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
XPath xPath = XPathFactory.newInstance().newXPath()
String expression = "/GetMatchingProductForIdResponse/GetMatchingProductForIdResult/Products/Product"
System.out.println(expression)
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET)
System.out.println("the size will be of the node list ${nodeList.getLength()}");
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue()+"the value coming will be "); // here i am getting value null for each node
}
} catch (Exception e) {
e.printStackTrace(System.out);
}
to parse the XML:
<?xml version="1.0"?>
<GetMatchingProductForIdResponse xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01">
<GetMatchingProductForIdResult Id="H5-9OSH-9NZ7" IdType="SellerSKU" status="Success">
<Products xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" xmlns:ns2="http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd">
<Product>
<Identifiers>
<MarketplaceASIN>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<ASIN>B004FQLAH2</ASIN>
</MarketplaceASIN>
</Identifiers>
<AttributeSets>
<ns2:ItemAttributes xml:lang="en-US">
<ns2:Binding>Office Product</ns2:Binding>
<ns2:Brand>Konica-Minolta</ns2:Brand>
<ns2:Color>Y</ns2:Color>
<ns2:CPUSpeed Units="MHz">200</ns2:CPUSpeed>
<ns2:Department>Printers</ns2:Department>
<ns2:Feature>Amp Up your Output - The magicolor 3730DN business color laser printer outputs at speeds up to 25 ppm in both color and B&W which means you can keep up in just about any business environment.</ns2:Feature>
<ns2:Feature>Unparalleled Image Quality - High resolution 2400 (equivalent) x 600 dpi printing for great color and clarity in both images and text.</ns2:Feature>
<ns2:Feature>Happy Planet, Outstanding Printing - Simitri HD Toner with Biomass allows for outstanding printing with the environment in mind.</ns2:Feature>
<ns2:Feature>Connect quicker - Why wait? Standard Ethernet and high-speed USB 2.0 gets you connected faster than ever before.Specifications</ns2:Feature>
<ns2:Feature>Type - Full-Color Laser Printer</ns2:Feature>
<ns2:ItemDimensions>
<ns2:Height Units="inches">13.62</ns2:Height>
<ns2:Length Units="inches">20.47</ns2:Length>
<ns2:Width Units="inches">16.50</ns2:Width>
<ns2:Weight Units="pounds">56.22</ns2:Weight>
</ns2:ItemDimensions>
<ns2:IsAutographed>false</ns2:IsAutographed>
<ns2:IsMemorabilia>false</ns2:IsMemorabilia>
<ns2:Label>Konica</ns2:Label>
<ns2:ListPrice>
<ns2:Amount>449.00</ns2:Amount>
<ns2:CurrencyCode>USD</ns2:CurrencyCode>
</ns2:ListPrice>
<ns2:Manufacturer>Konica</ns2:Manufacturer>
<ns2:Model>A0VD017</ns2:Model>
<ns2:NumberOfItems>1</ns2:NumberOfItems>
<ns2:OperatingSystem>Windows XP, Vista, 7</ns2:OperatingSystem>
<ns2:OperatingSystem>Mac X 10.2.8, 10.6+</ns2:OperatingSystem>
<ns2:PackageDimensions>
<ns2:Height Units="inches">19.00</ns2:Height>
<ns2:Length Units="inches">24.20</ns2:Length>
<ns2:Width Units="inches">22.00</ns2:Width>
<ns2:Weight Units="pounds">65.30</ns2:Weight>
</ns2:PackageDimensions>
<ns2:PackageQuantity>1</ns2:PackageQuantity>
<ns2:PartNumber>A0VD017</ns2:PartNumber>
<ns2:ProductGroup>CE</ns2:ProductGroup>
<ns2:ProductTypeName>PRINTER</ns2:ProductTypeName>
<ns2:Publisher>Konica</ns2:Publisher>
<ns2:SmallImage>
<ns2:URL>http://ecx.images-amazon.com/images/I/21qN3BU-BHL._SL75_.jpg</ns2:URL>
<ns2:Height Units="pixels">75</ns2:Height>
<ns2:Width Units="pixels">75</ns2:Width>
</ns2:SmallImage>
<ns2:Studio>Konica</ns2:Studio>
<ns2:Title>Konica Minolta Magicolor 3730DN Color Laser Printer 24PPM 2400X600DPI ENET USB 2.0</ns2:Title>
</ns2:ItemAttributes>
</AttributeSets>
<Relationships/>
<SalesRankings/>
</Product>
</Products>
</GetMatchingProductForIdResult>
<ResponseMetadata>
<RequestId>0b508338-3afe-4178-adc4-60c9c8448987</RequestId>
</ResponseMetadata>
</GetMatchingProductForIdResponse>
The getNodeValue method in the DOM is defined to always return null for element nodes (see the table at the top of the JavaDoc page for org.w3c.dom.Node for details). If you want the text inside the element then you should use getTextContent() instead.
You've added a second question in a comment to this answer asking how you can use an XPath to search for nodes that have a namespace prefix such as ns2:. The way XPath 1.0 handles namespaces is that unprefixed names always refer to nodes that are not in a namespace, and if you want to reference namespaced nodes then you have to provide a binding of namespace URIs to prefixes (which in javax.xml.xpath is the job of a NamespaceContext) and then use those prefixes in the expressions. The prefixes you use in the expression need not be the same ones as the original document used, as long as they bind to the right URIs.
Thus the original XPath you were using:
/GetMatchingProductForIdResponse/GetMatchingProductForIdResult/Products/Product
should not actually have matched anything, because the GetMatchingProductForIdResponse etc. elements in your document are in a namespace, but you got away with it because DocumentBuilderFactory is by default not namespace aware. The correct thing to do here is to use a namespace-aware parser, and provide a suitable namespace context to the XPath engine. There's no default implementation of NamespaceContext available in the core Java library, unfortunately, but Spring provides a convenient SimpleNamespaceContext implementation you can use if you don't want to roll your own.
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true); // parse with namespaces
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(xmlResponse.getBytes("utf-8"))));
doc.getDocumentElement().normalize();
XPath xPath = XPathFactory.newInstance().newXPath();
SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
xPath.setNamespaceContext(nsCtx);
nsCtx.bindNamespaceUri("prod", "http://mws.amazonservices.com/schema/Products/2011-10-01");
nsCtx.bindNamespaceUri("ns2", "http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd");
String expression = "/prod:GetMatchingProductForIdResponse/prod:GetMatchingProductForIdResult/prod:Products/prod:Product/prod:AttributeSets/ns2:ItemAttributes/ns2:Binding";
// ...

Java XPath: Get all the elements that match a query

I want to make an XPath query on this XML file (excerpt shown):
<?xml version="1.0" encoding="UTF-8"?>
<!-- MetaDataAPI generated on: Friday, May 25, 2007 3:26:31 PM CEST -->
<Component xmlns="http://xml.sap.com/2002/10/metamodel/webdynpro" xmlns:IDX="urn:sap.com:WebDynpro.Component:2.0" mmRelease="6.30" mmVersion="2.0" mmTimestamp="1180099591892" name="MassimaleContr" package="com.bi.massimalecontr" masterLanguage="it">
...
<Component.UsedModels>
<Core.Reference package="com.test.test" name="MasterModel" type="Model"/>
<Core.Reference package="com.test.massimalecontr" name="MassimaleModel" type="Model"/>
<Core.Reference package="com.test.test" name="TravelModel" type="Model"/>
</Component.UsedModels>
...
I'm using this snippet of code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document document = builder.parse(new File("E:\\Test branch\\test.wdcomponent"));
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
...(omitted)
System.out.println(xpath.evaluate(
"//d:Component/d:Component.UsedModels/d:Core.Reference/@name",
document));
What I'm expecting to get:
MasterModel
MassimaleModel
TravelModel
What I'm getting:
MasterModel
It seems that only the first element is returned. How can I get all the occurrences that match my query?
You need to ask for a result of type NodeList (XPathConstants.NODESET):
XPathExpression expr = xpath.compile("//d:Component/d:Component.UsedModels/d:Core.Reference/@name");
NodeList list = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < list.getLength(); i++) {
Node node = list.item(i);
System.out.println(node.getTextContent());
// work with node
}
See How to read XML using XPath in Java
As per that example, if you first compile the XPath expression and then evaluate it, specifying that you want a NODESET back, you should get the result you want.
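The difference between the two evaluate() overloads can be seen side by side in the following sketch. To stay self-contained it matches on local-name() instead of the d: prefix, since the NamespaceContext in the question is elided; UsedModelNames and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
final class UsedModelNames {

    // local-name() sidesteps the namespace binding that the question omits.
    private static final String NAMES = "//*[local-name()='Core.Reference']/@name";

    private static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** evaluate(expr, item) returns a String: the value of the first match only. */
    static String firstNameOnly(String xml) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(NAMES, parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    /** Asking for NODESET returns every matching attribute node. */
    static List<String> allNames(String xml) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(NAMES, parse(xml), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue()); // attribute nodes do have a node value
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

The String-returning overload converts the node set to a string, which in XPath 1.0 means the string value of the first node in document order. That is why the code in the question printed only MasterModel.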
