I am trying to extract a 'PartyID' from a request using XPath. This request is in the form of XML.
Here is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<s1:invokerules xmlns:s1="http://rules.kmtool.abc.com"><s1:arg0><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<kbdInitiateRequest>
<kmTestHeader>
<MessageId>USER1_MSG1</MessageId>
<TestDate>08/07/2008 07:34:15</TestDate>
<TestReference>
<ConductorReference>
<InvokeIdentifier>
<RefNum>USER1_Ref1</RefNum>
</InvokeIdentifier>
</ConductorReference>
</TestReference>
<TestParty>
<ConductorParty>
<Party PartyID="123456789" AgencyID="DUNS">
<TestContact>
<DetailedContact>
<ContactName>Michael Jackson</ContactName>
<Telephone>02071059053</Telephone>
<TelephoneExtension>4777</TelephoneExtension>
<Email>Michal.Jackson@Neverland.com</Email>
<Title>Mr</Title>
<FirstName>Michael</FirstName>
<Initials>MJ</Initials>
</DetailedContact>
</TestContact>
</Party>
</ConductorParty>
<PerformerParty>
<Party PartyID="987654321" AgencyID="DUNS">
</Party>
</PerformerParty>
</TestParty>
</kmTestHeader>
<kmToolMessage>
<controlNode>
<userRequest>INITIATE</userRequest>
</controlNode>
<customer>
<circuitID>000111333777</circuitID>
</customer>
</kmToolMessage>
</kbdInitiateRequest>
]]></s1:arg0>
</s1:invokerules>
</soapenv:Body>
</soapenv:Envelope>
I have a method in my Java code called getPartyId() that should extract the PartyID from the XML. However, I cannot get it to return the PartyID no matter what XPath query I use. This is where I need help.
Here is the getPartyId method:
private String getPartyId(String xml) throws XPathExpressionException
{
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
public String getNamespaceURI(String prefix) {
if (prefix == null) throw new NullPointerException("Null prefix");
else if ("SOAP-ENV".equals(prefix)) return "http://schemas.xmlsoap.org/soap/envelope/";
else if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
return XMLConstants.NULL_NS_URI;
}
public String getPrefix(String uri) {
throw new UnsupportedOperationException();
}
public Iterator getPrefixes(String uri) {
throw new UnsupportedOperationException();
}
});
XPathExpression expr = xpath.compile("/SOAP-ENV:Envelope/SOAP-ENV:Body/*/*/*/*/*/*/*/*/*/*/*[local-name()='PartyID']/text()");
InputSource source = new InputSource(new StringReader(xml));
String dunsId = (String) expr.evaluate(source,XPathConstants.STRING);
return dunsId;
}
I believe that the problem lies with the XPathExpression:
XPathExpression expr = xpath.compile("/SOAP-ENV:Envelope/SOAP-ENV:Body/*/*/*/*/*/*/*/*/*/*/*[local-name()='PartyID']/text()");
I have tried a number of alternatives for 'expr', but none of them have worked. Does anyone have any ideas?
Because the XML you need to parse is sitting inside a CDATA block, you'll need to re-parse the value of s1:arg0 before accessing data within it.
You will need to do this in two steps.
First, access the arg0 node in the http://rules.kmtool.abc.com namespace.
Since you don't have a NamespaceContext entry for this inner xmlns, you can use:
/SOAP-ENV:Envelope/SOAP-ENV:Body/*[local-name()='invokerules']
/*[local-name()='arg0']/text()
You then need to load this value into another InputSource.
The PartyID attribute can be accessed via the path:
kbdInitiateRequest/kmTestHeader/TestParty/ConductorParty/Party/@PartyID
(no need to use local-name() since there aren't any xmlns in the CDATA)
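The two steps above can be sketched with the JDK's own javax.xml classes. This is only a minimal sketch under a few assumptions: the class name PartyIdExtractor and the helper parse() are made up for illustration, the outer path uses local-name() for every step so that no NamespaceContext is needed, and it takes the string value of arg0 rather than its first text() node so that it does not depend on how the parser reports the CDATA section.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original code.
class PartyIdExtractor {

    // Parses an XML string into a namespace-aware DOM document.
    private static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    static String getPartyId(String soapXml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Step 1: the CDATA payload is plain text as far as the outer
            // document is concerned, so read it as the string value of arg0.
            String inner = xpath.evaluate(
                    "/*[local-name()='Envelope']/*[local-name()='Body']"
                    + "/*[local-name()='invokerules']/*[local-name()='arg0']",
                    parse(soapXml));

            // Step 2: re-parse the payload as its own document and query it.
            // trim() removes whitespace that would otherwise precede the
            // inner XML declaration and make the parser reject it.
            return xpath.evaluate(
                    "/kbdInitiateRequest/kmTestHeader/TestParty/ConductorParty/Party/@PartyID",
                    parse(inner.trim()));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not extract PartyID", e);
        }
    }
}
```

Calling PartyIdExtractor.getPartyId(xml) with the request from the question should return the ConductorParty value; swapping ConductorParty for PerformerParty in the second path selects the other Party element.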
Notice that your inner XML is inside a CDATA node.
So basically you are trying to query a path within XML that sits inside CDATA.
As this thread states:
Xpath to the tag inside CDATA
it seems this is not possible :(
I would suggest taking the CDATA content in your code, parsing it into a new XML Document, and querying that.
Thanks,
Amir
I am new to XPath, and I have never dealt with XML in Java. I want to get values from an XML document. The tags may or may not be prefixed with mgns1:. So I wrote this code:
private List<String> parse(Node node, String file) throws XPathExpressionException {
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
public String getNamespaceURI(String prefix) {
return prefix.equals("mgns1") ? "urn:edeveloper.Fournisseurs1031af" : null;
}
public Iterator<?> getPrefixes(String val) {
return null;
}
public String getPrefix(String uri) {
return null;
}
});
Node node_codreg = (Node) xpath.evaluate("mgns1:CODREG", node, XPathConstants.NODE);
...
}
I tried it with an XML document that does not have the mgns1: prefix, but at runtime I get no nodes! So what is wrong?
Edit:
Here is an example of the XML:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Fournisseurs>
<JournalExtract>2</JournalExtract>
<Record>
<STATUTRECORD>S</STATUTRECORD>
<CODFOUR>148</CODFOUR>
<RAISOC></RAISOC>
<ADRFOUR></ADRFOUR>
<CPVILLE></CPVILLE>
<CODPAYS></CODPAYS>
<TELEPHONE></TELEPHONE>
<TELEX></TELEX>
<FAX></FAX>
<EMAIL></EMAIL>
<SIRET></SIRET>
<CONDPAIE></CONDPAIE>
<MODPAIE></MODPAIE>
<LIVR></LIVR>
<REMISE>0.00</REMISE>
<DEVISE></DEVISE>
<CLASSE></CLASSE>
<DELMOY>0.00</DELMOY>
<TVAIC></TVAIC>
<MOTCLE></MOTCLE>
<DTEAGR>00/00/0000</DTEAGR>
<CODREG></CODREG>
<MTMINFAC>0.00</MTMINFAC>
<MTMINFRANCO>0.00</MTMINFRANCO>
<ZL01></ZL01>
<ZL02></ZL02>
<INDQUAL></INDQUAL>
<CERTIF></CERTIF>
<DTEVALMIN>00/00/0000</DTEVALMIN>
<DTEVALMAX>00/00/0000</DTEVALMAX>
<RAISOCREGL></RAISOCREGL>
<ADRREGL></ADRREGL>
<CPVILLEREGL></CPVILLEREGL>
<PAYSREGL></PAYSREGL>
<DOMBQE></DOMBQE>
<CODEBQE></CODEBQE>
<CODGUI></CODGUI>
<COMPTE></COMPTE>
<RIB></RIB>
<TYPETVA></TYPETVA>
<IBANPAYS></IBANPAYS>
<IBANCLE>00</IBANCLE>
<IBANCOMPTE></IBANCOMPTE>
<CODEBIC></CODEBIC>
<ROUTAGECDE></ROUTAGECDE>
<ACHSYSFRTVA>false</ACHSYSFRTVA>
<URL></URL>
<REMINCPXNET>false</REMINCPXNET>
<NOTMANSYST>false</NOTMANSYST>
<PROSPECT>false</PROSPECT>
<FOUPREF>false</FOUPREF>
<FOUPPAL></FOUPPAL>
<DTEMODTRI>00/00/0000</DTEMODTRI>
<NUMDUNS></NUMDUNS>
<CODLGFOU></CODLGFOU>
<NOALIMAUTSF>false</NOALIMAUTSF>
<AUTCDECH>false</AUTCDECH>
<SEUILEPDIF>false</SEUILEPDIF>
<MTMAXCDECH>0.00</MTMAXCDECH>
<MTMAXCC>0.00</MTMAXCC>
<MTMAXCCHCT>0.00</MTMAXCCHCT>
<CP></CP>
<VILLE></VILLE>
<CPREGL></CPREGL>
<VILREGL></VILREGL>
<CAMINST>0.00</CAMINST>
<CAMAXST>0.00</CAMAXST>
<OCCASION>false</OCCASION>
<ID_EXT></ID_EXT>
<TAXE2></TAXE2>
<TAXE3></TAXE3>
<TAXE4></TAXE4>
<DTECREDEM>00/00/0000</DTECREDEM>
<BDC_ELEC>false</BDC_ELEC>
<TYPE_FORM></TYPE_FORM>
<FORMAT>0</FORMAT>
<MODE_ENV></MODE_ENV>
<MAIL_DEST></MAIL_DEST>
<ADR_FTP></ADR_FTP>
<USR_FTP></USR_FTP>
<PWD_FTP></PWD_FTP>
<PATH_DEP></PATH_DEP>
<RECEPT_AUTO>false</RECEPT_AUTO>
<PERIODICITE></PERIODICITE>
<NOCCGEN>false</NOCCGEN>
</Record>
</Fournisseurs>
You are looking for elements named mgns1:CODREG, where mgns1 represents the namespace urn:edeveloper.Fournisseurs1031af.
In the XML document you have shown us, there are no elements in namespace urn:edeveloper.Fournisseurs1031af. So why would you expect your expression to select anything?
Moreover, you're only looking for direct children of the supplied node and you haven't told us what this node is. Perhaps you want to be looking for all descendants, not just direct children?
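If the goal is to match CODREG whether or not the document uses the mgns1 namespace, one option is to search all descendants and compare only the local name. The following is a rough sketch, not a recommendation: the class name CodregFinder and the method codregValues are invented for the example, and matching on local-name() ignores the namespace entirely, so it would also pick up a CODREG element from any other namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class used only for this illustration.
class CodregFinder {

    // Returns the text of every CODREG element, prefixed or not.
    static List<String> codregValues(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is needed so local-name() excludes the prefix.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // ".//*" walks all descendants, not just direct children.
            NodeList nodes = (NodeList) xpath.evaluate(
                    ".//*[local-name()='CODREG']", doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read CODREG values", e);
        }
    }
}
```

A stricter alternative, if a NamespaceContext for mgns1 is already registered, would be a union such as ".//CODREG | .//mgns1:CODREG", which accepts only the two intended forms.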
I am very new to XPath and I have the following problem:
I have a Java method that receives data from a web service. The data arrives as an XML document, so I have to use XPath to extract a specific value from it.
This is the entire XML output provided by my web service (the web service response):
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<getConfigSettingsResponse xmlns="http://tempuri.org/">
<getConfigSettingsResult><![CDATA[<root>
<status>
<id>0</id>
<message></message>
</status>
<drivers>
<drive id="tokenId 11">
<shared-secret>Shared 11</shared-secret>
<encoding>false</encoding>
<compression />
</drive>
<drive id="tokenId 2 ">
<shared-secret>Shared 2 </shared-secret>
<encoding>false</encoding>
<compression>false</compression>
</drive>
</drivers>
</root>]]></getConfigSettingsResult>
</getConfigSettingsResponse>
</s:Body>
</s:Envelope>
Now in a Java class I perform the following operations:
XPath xPath; // An utility class for performing XPath calls on JDOM nodes
Element objectElement; // An XML element
//xPath = XPath.newInstance("s:Envelope/s:Body/getVersionResponse/getVersionResult");
try {
// XPath selection:
xPath = XPath.newInstance("s:Envelope/s:Body");
xPath.addNamespace("s", "http://schemas.xmlsoap.org/soap/envelope/");
objectElement = (Element) xPath.selectSingleNode(documentXML);
if (objectElement != null) {
result = objectElement.getValue();
System.out.println("RESULT:");
System.out.println(result);
}
} catch (JDOMException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
and the result of printing the content of the result variable is this output:
RESULT:
<root>
<status>
<id>0</id>
<message></message>
</status>
<drivers>
<drive id="tokenId 11">
<shared-secret>Shared 11</shared-secret>
<encoding>false</encoding>
<compression />
</drive>
<drive id="tokenId 2 ">
<shared-secret>Shared 2 </shared-secret>
<encoding>false</encoding>
<compression>false</compression>
</drive>
</drivers>
</root>
Now my problem is that I want to access only the content of the <id>0</id> tag, so (in this case) my result variable should contain the value 0.
But I can't. I have tried changing the previous XPath selection to:
xPath = XPath.newInstance("s:Envelope/s:Body/s:status/s:id");
But this way my objectElement is null.
Why? What am I missing? What do I have to do so that my result variable contains the content of the id tag?
Thanks
Andrea
Your "root" node is inside a CDATA section. The whole section is interpreted as text, so you cannot search it with XPath. You can get the text from objectElement.getValue(), parse it as a new XML document, and then get the value of the "id" tag with a new XPath. Alternatively, you can search the text returned by objectElement.getValue() for the "id" tag value with a regular expression.
Really, you should be using the new XPath API in JDOM 2.x. Taking pasha701's answer into consideration, your code should look more like:
Namespace soap = Namespace.getNamespace("s", "http://schemas.xmlsoap.org/soap/envelope/");
Namespace tempuri = Namespace.getNamespace("turi", "http://tempuri.org/");
XPathExpression<Element> xpath = XPathFactory.instance().compile(
"s:Envelope/s:Body/turi:getConfigSettingsResponse/turi:getConfigSettingsResult",
Filters.element(), null, soap, tempuri);
Element result = xpath.evaluateFirst(documentXML);
String resultxml = result.getValue();
Document resultdoc = new SAXBuilder().build(new StringReader(resultxml));
Element id = resultdoc.getRootElement().getChild("status").getChild("id");
<rootNode>
<Movies>
<Movie id="1">
<title> title1</title>
<Actors>
<Actor>Actor1</Actor>
<Actor>Actor2</Actor>
</Actors>
</Movie>
</Movies>
<performers >
<performer id="100">
<name>name1</name>
<movie idref="1"/>
</performer>
</performers>
</rootNode>
Question1: I only want to get the Movie elements under Movies. I tried both DOM and SAX, but both also return the movie element under performers. How can I avoid this using SAX or DOM?
DOM:
doc.getElementsByTagName("movie");
SAX:
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("movie"))
Question2: How can I get an element nested inside another element (Actor under Movie) using DOM or SAX?
Basically, what I want to do is output the data in order.
1,title, Actor1,Actor2
100,name1,1
doc.getElementsByTagName("Movies").item(0).getChildNodes();
gets you all the child nodes of Movies, i.e. the Movie elements plus any whitespace text nodes between them (watch for lower-/upper-case!). See http://www.w3schools.com/dom/dom_intro.asp for a short tutorial.
XPath is designed for this type of extraction. For your example file, the query would be something like the following. For simplicity, I assumed your XML was in res/raw, but in practice you will need to create the InputSource from wherever you are getting your XML.
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/rootNode/Movies/Movie";
try {
NodeList nodes = (NodeList) xpath.evaluate(expression, doc,XPathConstants.NODESET);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
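To produce the ordered output asked for in the question, the same approach can be extended with relative XPath queries evaluated against each Movie and performer node. This is a sketch only: the class name MovieReport and the method report are invented here, and it assumes the sample XML is well-formed (quoted attribute values and a closing </Actors> tag), since a standard parser will reject it otherwise.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class used only for this illustration.
class MovieReport {

    static List<String> report(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> lines = new ArrayList<>();

            // The full path selects only Movie under Movies, never the
            // lower-case movie element under performers.
            NodeList movies = (NodeList) xpath.evaluate(
                    "/rootNode/Movies/Movie", doc, XPathConstants.NODESET);
            for (int i = 0; i < movies.getLength(); i++) {
                Element movie = (Element) movies.item(i);
                StringBuilder line = new StringBuilder(movie.getAttribute("id"));
                line.append(',').append(xpath.evaluate("normalize-space(title)", movie));
                // Relative path: Actor elements nested inside this Movie.
                NodeList actors = (NodeList) xpath.evaluate(
                        "Actors/Actor", movie, XPathConstants.NODESET);
                for (int j = 0; j < actors.getLength(); j++) {
                    line.append(',').append(actors.item(j).getTextContent().trim());
                }
                lines.add(line.toString());
            }

            NodeList performers = (NodeList) xpath.evaluate(
                    "/rootNode/performers/performer", doc, XPathConstants.NODESET);
            for (int i = 0; i < performers.getLength(); i++) {
                Element performer = (Element) performers.item(i);
                lines.add(performer.getAttribute("id")
                        + ',' + xpath.evaluate("normalize-space(name)", performer)
                        + ',' + xpath.evaluate("movie/@idref", performer));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not build report", e);
        }
    }
}
```

Each returned string is one output line, movies first and performers second, with normalize-space() stripping the stray leading blank in the title.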
XML :
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<wmHotelAvailResponse xmlns="http://host.com/subPath">
<OTA_HotelAvailRS Version="1.001">
</OTA_HotelAvailRS>
</wmHotelAvailResponse>
</soap:Body>
</soap:Envelope>
Code :
String xpathString = "/soap:Envelope/soap:Body/wmHotelAvailResponse/OTA_HotelAvailRS";
AXIOMXPath xpathExpression = new AXIOMXPath(xpathString);
xpathExpression.addNamespace("soap", "http://schemas.xmlsoap.org/soap/envelope/");
xpathExpression.addNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
xpathExpression.addNamespace("xsd", "http://www.w3.org/2001/XMLSchema");
OMElement rsMsg = (OMElement)xpathExpression.selectSingleNode(documentElement);
String version = rsMsg.getAttribute(new QName("Version")).getAttributeValue();
Question :
This works perfectly when the xmlns="http://host.com/subPath" part is deleted. I would like to know how I can add the xmlns="http://host.com/subPath" part to the xpathExpression to make the above work.
I tried the following, but it didn't work.
xpathExpression.addNamespace("", "http://host.com/subPath");
Solution:
1. Add this code:
xpathExpression.addNamespace("x", "http://host.com/subPath");
2. Change:
String xpathString = "/soap:Envelope/soap:Body/wmHotelAvailResponse/OTA_HotelAvailRS";
to:
String xpathString = "/soap:Envelope/soap:Body/x:wmHotelAvailResponse/x:OTA_HotelAvailRS";
Explanation:
XPath always treats any unprefixed element name as belonging to "no namespace".
Therefore when evaluating the XPath expression:
/soap:Envelope/soap:Body/wmHotelAvailResponse/OTA_HotelAvailRS
the Evaluator tries to find a wmHotelAvailResponse element that is in "no namespace" and fails because the only wmHotelAvailResponse element in the document belongs to the "http://host.com/subPath" namespace.
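The same behaviour can be observed with the XPath implementation that ships with the JDK, which may help when AXIOM is not at hand. The sketch below is an illustration under assumptions: DefaultNamespaceDemo and its evaluate method are invented names, and the prefix "x" is arbitrary, since only the namespace URI it is bound to matters.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class used only for this illustration.
class DefaultNamespaceDemo {

    // Evaluates an XPath expression against the XML and returns its string value.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new IllegalArgumentException("Null prefix");
                    if ("soap".equals(prefix)) return "http://schemas.xmlsoap.org/soap/envelope/";
                    // "x" stands in for the document's default namespace.
                    if ("x".equals(prefix)) return "http://host.com/subPath";
                    return XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return null; // not needed for evaluating expressions
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }
}
```

With the response from the question, the prefixed path ending in /x:wmHotelAvailResponse/x:OTA_HotelAvailRS/@Version yields the version string, while the unprefixed path yields an empty string because nothing matches. The Version attribute itself stays unprefixed: a default namespace declaration applies to elements, not to attributes.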
I'm trying to parse an XML string, and the tagnames are variable; I haven't seen any examples on how to pull the information out without knowing them. For example, I will always know the <response> and <data> tags below, but what falls inside/outside of them could be anything from <employee> to you name it.
<?xml version="1.0" encoding="UTF-8"?>
<response>
<generic>
....
</generic>
<data>
<employee>
<name>Seagull</name>
<id>3674</id>
<age>34</age>
</employee>
<employee>
<name>Robin</name>
<id>3675</id>
<age>25</age>
</employee>
</data>
</response>
You could parse it into a generic dom object and traverse it. For example, you could use dom4j to do this.
From the dom4j quick start guide:
public void treeWalk(Document document) {
treeWalk( document.getRootElement() );
}
public void treeWalk(Element element) {
for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
Node node = element.node(i);
if ( node instanceof Element ) {
treeWalk( (Element) node );
}
else {
// do something....
}
}
}
public Document parse(URL url) throws DocumentException {
SAXReader reader = new SAXReader();
Document document = reader.read(url);
return document;
}
I have seen a similar situation in projects.
If you are going to deal with large XML documents, you can use a StAX or SAX parser to read the XML. At each step (for example on reaching an end element), put the data into a Map or another data structure of your choice, with the tag name as the key and the element text as the value. Once parsing is done, use this Map to figure out which object to build, since by then you have a proper entity representation of the information in the XML.
If the XML is small, use DOM and build the entity object directly by reading the specific tag (like <employee>), or use XPath at the location where you expect the tag to be present, which gives you a hint about the entity. Build that object directly by reading the specific information from the XML.
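The streaming approach can be sketched with the StAX API that ships with the JDK. This is a minimal sketch under stated assumptions rather than a finished design: the class GenericRecordReader and method readRecords are invented names, and it assumes the layout from the question, where each child of <data> is one record and each child of a record is a simple text field. Tag names are never hard-coded apart from data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class used only for this illustration.
class GenericRecordReader {

    // Returns one map (tag name -> text) per child element of <data>.
    static List<Map<String, String>> readRecords(String xml) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0;            // 1 = response, 2 = data, 3 = record, 4 = field
            boolean inData = false;   // true while inside <data>
            Map<String, String> current = null;
            StringBuilder text = new StringBuilder();

            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        depth++;
                        if (depth == 2) {
                            inData = "data".equals(reader.getLocalName());
                        } else if (inData && depth == 3) {
                            current = new LinkedHashMap<>();
                        }
                        text.setLength(0); // start collecting text afresh
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (inData && depth == 4 && current != null) {
                            current.put(reader.getLocalName(), text.toString().trim());
                        } else if (inData && depth == 3) {
                            records.add(current);
                            current = null;
                        } else if (depth == 2) {
                            inData = false;
                        }
                        depth--;
                        break;
                    default:
                        break;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read records", e);
        }
        return records;
    }
}
```

Each map preserves field order through LinkedHashMap, so the caller can inspect the keys to decide which entity to build. If the record's own tag name (employee in the sample) is needed for that decision, it could be captured from getLocalName() at depth 3 in the same way.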