I'm learning to parse XML files and use XPath to run queries. I don't know how to list all the Names without repeating duplicates. Is there an option for this, or do I have to do it manually?
<Return>
<ReturnData>
<Person>
<Name>Samuel</Name>
</Person>
<Person>
<Name>Samuel</Name>
</Person>
</ReturnData>
</Return>
In XPath 2.0 and higher, use distinct-values(//Name).
Java's built-in XPath processor only supports XPath 1.0, in which this query is surprisingly difficult, but there are third-party Java libraries supporting XPath 2.0, 3.0, and 3.1, notably Saxon. Saxon-HE is open source; see http://saxon.sf.net/.
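If you have to stay on the JDK's built-in XPath 1.0 processor, one common workaround is to keep only the first occurrence of each value by filtering on the preceding axis. The following is a minimal sketch of that idea, not the only way to do it; the class and method names (DistinctNames, distinctNames) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: de-duplicates <Name> values using plain XPath 1.0.
final class DistinctNames {
    static List<String> distinctNames(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Keep a Name only if no earlier Name in document order has the same string value.
        NodeList nodes = (NodeList) xpath.evaluate(
                "//Name[not(. = preceding::Name)]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }
}
```

Calling DistinctNames.distinctNames(...) on the sample document from the question should return a list containing Samuel once. Note that the preceding-axis filter compares every Name against all earlier ones, so it can get slow on large documents; collecting the values into a LinkedHashSet in Java is an alternative.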
Can we use the tokenize function in XPath?
The general Java code I use to process XSLT and XML files is:
XPath xPath = XPathFactory.newInstance().newXPath();
InputSource inputXML = new InputSource(new StringReader(xml));
String expression = "/root/customer/personalDetails[age=tokenize('20|30','|')]/name";
boolean evaluate1 = (boolean) xPath.compile(expression).evaluate(inputXML, XPathConstants.BOOLEAN);
XML :-
<?xml version="1.0" encoding="ISO-8859-15"?>
<root>
<customer>
<personalDetails>
<name>ABC</name>
<value>20</value>
</personalDetails>
<personalDetails>
<name>XYZ</name>
<value>21</value>
</personalDetails>
<personalDetails>
<name>PQR</name>
<value>30</value>
</personalDetails>
</customer>
</root>
Expected Response :- ABC,PQR
Yes, you can use the tokenize() function in XPath, provided your XPath processor supports XPath 2.0 or later.
For Java, the popular choice of XPath 2.0+ processor is Saxon.
You can use the JAXP API with Saxon. However, JAXP is not really designed to work well with XPath 2.0+, so it's preferable to use Saxon's own API (called s9api).
For this particular example, you don't need tokenize(). In XPath 2.0+ you can write
[age=('20', '30')]
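If switching processors is not an option, the same filter can be spelled out with "or" in XPath 1.0, which the JDK's built-in processor does handle. This is only a sketch under a few assumptions: the class and method names are hypothetical, and the predicate tests the value element, because that is what the sample XML actually contains (the expression in the question tests age, which does not appear in the sample).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: XPath 1.0 fallback for "value is one of 20, 30".
final class NamesByValue {
    static List<String> namesWithValue20Or30(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath 1.0 has no tokenize() and no sequence literals, so each alternative is listed.
        String expression =
                "/root/customer/personalDetails[value='20' or value='30']/name";
        // Ask for a node set rather than a boolean so the matching names can be read.
        NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }
}
```

For the sample XML in the question this should yield ABC and PQR, in document order.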
I use XSLT to convert the XML to JSON. I use XSLT instead of Jackson/org.json as XSLT retains the namespace information.
For example, for the below SOAP XML request,
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<AccountDetailsRequest xmlns="http://com/blog/demo/webservices/accountservice">
<accountNumber>12345</accountNumber>
</AccountDetailsRequest>
</soap:Body>
</soap:Envelope>
the XSLT produces the following JSON:
{"soap:Envelope":{"soap:Body":{"AccountDetailsRequest":{"accountNumber":"12345"}}}}
The namespace definition is lost. But, I plan to store the namespace definition in a map.
I use Jackson/org.json (both return similar results) to convert it back to XML, and I get the following XML:
<soap:Envelope>
<soap:Body>
<AccountDetailsRequest>
<accountNumber>12345</accountNumber>
</AccountDetailsRequest>
</soap:Body>
</soap:Envelope>
The only part I am not able to figure out is the way to add the namespace definition assuming I can store that information in a map.
I considered wrapping everything in a <root> </root> element that declares all the namespaces, since the W3C standard allows this. But SOAP does not accept such XML.
Is there any way to get back the XML with proper namespace information?
When you say you're using XSLT to do the conversion, I guess that means you are doing it "by hand", rather than by using the XSLT 3.0 xml-to-json() function (which in fact wouldn't help you very much here).
But if your code has limitations, then it's hard to help you without seeing your code.
It's easy enough to find out all the namespaces in scope for an element by using the namespace axis. I'm not sure if you're trying that and getting it wrong, or if you're unaware of the feature. That's why we need to see your code.
It appears that the org.json converter accepts namespace declarations such as in:
String expectedStr =
"{\"addresses\":{\"address\":{\"name\":\"\",\"nocontent\":\"\","+
"\"something\":[1, 2, 3]},\"xsi:noNamespaceSchemaLocation\":\"test.xsd\",\""+
"xmlns:xsi\":\"http://www.w3.org/2001/XMLSchema-instance\"}}";
so you need to modify your XML-to-JSON converter to generate such declarations. The code you've adopted does
<xsl:apply-templates select="@*" mode="attr" />
at line 39 to process attributes. In XSLT 2.0+ you could process namespaces the same way using select="namespace::*". IIRC, however, XSLT 1.0 can't match namespace nodes in a pattern, so if you're stuck on 1.0 you would need to use something like
<xsl:for-each select="namespace::*">
...
</xsl:for-each>
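A different route, for the case described in the question where the declarations are kept in a map, is to re-attach them to the existing root element after the JSON-to-XML step, instead of carrying them through the JSON. The sketch below uses only the JDK's DOM and Transformer APIs. The class name, method name, and map layout (attribute name such as xmlns:soap mapped to the URI) are assumptions for illustration, and it assumes every prefix maps to a single URI across the whole document, so that hoisting all declarations to the root is safe.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: puts stored namespace declarations back on the root element.
final class NamespaceRestorer {
    static String addDeclarations(String xml, Map<String, String> declarations)
            throws Exception {
        // The default factory is NOT namespace-aware, so a document with
        // undeclared prefixes such as "soap:" can still be parsed.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Keys are expected to look like "xmlns:soap" or "xmlns" (assumption).
        for (Map.Entry<String, String> entry : declarations.entrySet()) {
            doc.getDocumentElement().setAttribute(entry.getKey(), entry.getValue());
        }

        // Serialize the modified DOM back to a string with an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Unlike the extra <root> wrapper, this keeps soap:Envelope as the document element. A default namespace declared on the root also applies to the unprefixed AccountDetailsRequest and its children, which matches the original message only because no other unprefixed elements sit outside it.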
I am using XStream library for XML parsing. I was wondering if the library allows jumping to a particular node directly using the index.
For example:
<details>
<personal>
<basicInfo>
<firstName>John</firstName>
<lastName>Doe</lastName>
<phoneNumber>9999999999</phoneNumber>
<dateOfBirth>1990-01-01</dateOfBirth>
</basicInfo>
<address>
<street>random St.</street>
<city>City</city>
<stateProv>BC</stateProv>
<country>CA</country>
<postCode>12345</postCode>
</address>
</personal>
<personal>
<basicInfo>
<firstName>John2</firstName>
<lastName>Doe2</lastName>
<phoneNumber>9999999999</phoneNumber>
<dateOfBirth>1990-01-01</dateOfBirth>
</basicInfo>
<address>
<street>random St.2</street>
<city>City2</city>
<stateProv>BC2</stateProv>
<country>CA2</country>
<postCode>12345</postCode>
</address>
</personal>
</details>
For the XML above I would like to skip the first <personal>...</personal>
and only process the second node. Can I call it using an index?
XStream is a simple library to serialize objects to XML and back again.
I am not sure what you mean by "process" in this context, but if your POJO for serialization is set up correctly to contain a List of "personal" nodes, I don't see why you couldn't deserialize the XML and remove the unwanted node after the fact.
As far as I know, VTD-XML is the only XML parser that natively offers an indexing feature, called VTD+XML.
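If the goal is only to read values from the n-th <personal> element, and using something other than XStream for that step is acceptable, a positional XPath predicate with the JDK's built-in processor is one option. This is a sketch of a swapped-in technique, not an XStream feature; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: reads firstName from the personal element at a 1-based index.
final class PersonalByIndex {
    static String firstNameAt(String xml, int index) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath positions are 1-based, so personal[2] is the second <personal> child.
        String expression = "/details/personal[" + index + "]/basicInfo/firstName";
        // The two-argument evaluate() returns the string value of the first match,
        // or an empty string when nothing matches.
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }
}
```

For the document above, firstNameAt(xml, 2) should give John2. The whole document is still parsed; the index only selects which node is read.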
I'm currently using JDOM for doing some simple XML parsing, and it seems like nothing's type safe - I had a similar issue with using the built-in Java DOM parser, just with lots more API to wade through.
For example, XPath.selectNodes takes an Object as its argument and returns a raw list, which just feels a little Java 1.1
Are there generic-ized XML and XPath libraries for Java, or is there some reason why it's just not possible to do XPath queries in a type-safe way?
If you're familiar with CSS selectors on HTML, it may be good to know that Jsoup supports XML as well.
Update: OK, judging by the downvote, that was apparently a very controversial answer. It may, however, end up being easier and less verbose than XPath when all you want is to select node values, because the Jsoup API is very slick. Let's look at a more concrete example. Assume that you have an XML file which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<persons>
<person id="1">
<name>John Doe</name>
<age>30</age>
<address>
<street>Main street 1</street>
<city>Los Angeles</city>
</address>
</person>
<person id="2">
<name>Jane Doe</name>
<age>40</age>
<address>
<street>Park Avenue 1</street>
<city>New York</city>
</address>
</person>
</persons>
Then you can traverse it as follows:
Document document = Jsoup.parse(new File("/persons.xml"), "UTF-8");
Element person2 = document.select("person[id=2]").first();
System.out.println(person2.select("name").text());
Elements streets = document.select("street");
for (Element street : streets) {
System.out.println(street.text());
}
which outputs
Jane Doe
Main street 1
Park Avenue 1
Update 2: since Jsoup 1.6.2, which was released in March 2012, XML parsing is officially supported by the Jsoup API.
AFAIK all of the XML query APIs in Java are non-type-safe, and most are Java 1.3 compatible. That said, my favorite parser/generator is the XML pull parser (xmlpp) style parser. I believe Java has XMLStreamReader and XMLStreamWriter if you're using Java 6, which are almost the same as the xmlpp library. I especially like that I can write a method getFoo that takes a stream reader, pulls from it, and returns a Foo object. It's sort of the best of both DOM and SAX. I think some people refer to it as StAX.
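To make the getFoo idea concrete, here is a minimal StAX sketch using the JDK's javax.xml.stream API. The Person class, the readPerson and parseFirstPerson methods, and the element names (person, name, age, borrowed from the Jsoup example in the other answer) are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example: a typed "getFoo"-style reader built on StAX.
final class StaxPersonReader {
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // Assumes the reader is positioned on a <person> start tag; consumes
    // events up to and including the matching </person>.
    static Person readPerson(XMLStreamReader reader) throws XMLStreamException {
        String name = null;
        int age = -1;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String tag = reader.getLocalName();
                if (tag.equals("name")) {
                    name = reader.getElementText();
                } else if (tag.equals("age")) {
                    age = Integer.parseInt(reader.getElementText().trim());
                }
                // Other child elements (e.g. address) are simply skipped over.
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals("person")) {
                break;
            }
        }
        return new Person(name, age);
    }

    // Advances to the first <person> element and returns it as a typed object,
    // or null if the document contains none.
    static Person parseFirstPerson(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("person")) {
                    return readPerson(reader);
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

The caller gets a Person back rather than a raw node list, which is where the type safety comes from: the untyped parsing is confined to one small method per element type.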
I'm getting a little ramble-y so I'm quitting now
According to this, you can make use of xs:key and xs:keyref when marshalling and unmarshalling data in JAXB 2.x.
However, I can't find a working example of this being done anywhere.
What we're doing is setting a lookup section in each XML message containing the details for the reference/code values (id, name, description, etc), and then have the data elements later in the message refer back to these items using their key. XML schema defines and supports this through xs:keyref and xs:key (xs:IDREF is not an allowed option).
What I'd like to do is have my JAXB unmarshaller follow these refs dynamically, replacing the key with the referenced object.
Could anybody refer me to an example of this being done?
Are you talking about a compound key situation?
<directory>
<employee>
<eID>123</eID>
<country>CA</country>
</employee>
<employee>
<eID>123</eID>
<country>US</country>
</employee>
<employee>
<eID>456</eID>
<country>US</country>
</employee>
<phone-number>
<contact eID="123" country="US"/>
</phone-number>
</directory>
If so, EclipseLink JAXB (MOXy) could be used:
http://wiki.eclipse.org/EclipseLink/Examples/MOXy/JPA/CompoundPrimaryKeys