After converting an XSD to Java objects using XJC, I would like to generate an XML file by supplying an XPath and a value for that XPath.
Examples.
Say I'm giving xpath and the value as
customer/name = XXXXX_VALUE
It should internally assign the value on the generated objects, e.g. via CustomerType.setName().
The XML should then be generated as expected, following the XPath.
I know that in Castor we can do this using ClassDescriptor and FieldDescriptor, but I would like to know how to do it using XJC.
JXPath can be used to navigate JavaBeans via expressions similar to XPath.
http://commons.apache.org/proper/commons-jxpath/
Specifically, when you supply a factory, JXPath can create missing objects along the path. Several situations are not supported natively, but with a little thought you can implement your own extension of createPathAndSetValue that deals with your specific predicate logic.
http://commons.apache.org/proper/commons-jxpath/users-guide.html#Creating_Objects
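To illustrate the idea behind createPathAndSetValue with a factory, here is a minimal sketch that uses plain reflection instead of JXPath itself. Order and CustomerType are hypothetical stand-ins for XJC-generated classes, BeanPaths is an invented helper, and only simple slash-separated paths without predicates are handled.

```java
import java.lang.reflect.Method;

/** Hypothetical stand-in for an XJC-generated class. */
class CustomerType {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

/** Hypothetical root object that holds a customer. */
class Order {
    private CustomerType customer;
    public CustomerType getCustomer() { return customer; }
    public void setCustomer(CustomerType customer) { this.customer = customer; }
}

final class BeanPaths {
    /** Sets the property addressed by a slash-separated path, creating null intermediates. */
    static void setPath(Object root, String path, Object value) {
        try {
            String[] steps = path.split("/");
            Object current = root;
            for (int i = 0; i < steps.length - 1; i++) {
                Method getter = current.getClass().getMethod("get" + capitalize(steps[i]));
                Object child = getter.invoke(current);
                if (child == null) {
                    // Plays the role of the factory: build the missing object and attach it.
                    child = getter.getReturnType().getDeclaredConstructor().newInstance();
                    setter(current, steps[i]).invoke(current, child);
                }
                current = child;
            }
            setter(current, steps[steps.length - 1]).invoke(current, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot set " + path, e);
        }
    }

    /** Finds a public single-argument setter for the given property name. */
    private static Method setter(Object bean, String property) throws NoSuchMethodException {
        String name = "set" + capitalize(property);
        for (Method m : bean.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 1) {
                return m;
            }
        }
        throw new NoSuchMethodException(name);
    }

    private static String capitalize(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }
}
```

Calling BeanPaths.setPath(order, "customer/name", "XXXXX_VALUE") on a fresh Order creates the CustomerType and sets its name; the populated object could then be marshalled with JAXB as usual.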
Related
Given following XML, we are using JXPathContext to create Java object out of it.
<fb1:Activity fb2:metadata="Activity1">
</fb1:Activity>
<fb21:ActivityMetadata fb2:id="Activity1">
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
Reading the value:
String responseCode = (String) context.getValue("metadata[1]/Response/value");
This works as expected. Now let's say the reference from Activity to ActivityMetadata is missing. What can we do to read the response value in that case? It is guaranteed that there is at most one ActivityMetadata element in the XML.
Incomplete XML that we need to parse:
<fb1:Activity fb2:metadata="">
</fb1:Activity>
<fb21:ActivityMetadata>
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
The path you're giving us doesn't match the document you're showing us.
Ignoring that for a moment:
XML doesn't constrain the tree at all; that's done by the XML Schema (if there is one) and/or the applications which process that kind of document. Only the folks who defined this particular kind of document, or the schema, or the code can tell you whether there are any guarantees about only one ActivityMetadata being present or what it means if there's more than one.
XML is pure syntax. Meaning is someone else's problem.
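If the application really does guarantee at most one ActivityMetadata element, one option is to look it up by name rather than through the reference. The sketch below uses the standard JAXP XPath API on a DOM rather than JXPathContext; MetadataLookup is a hypothetical helper, and it assumes a well-formed document with declared namespace prefixes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class MetadataLookup {
    /** Returns the text of the Response inside the ActivityMetadata element, or "" if absent. */
    static String response(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Namespace awareness lets local-name() match regardless of the prefix used.
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(
                    "//*[local-name()='ActivityMetadata']/*[local-name()='Response']", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

Because the expression ignores the metadata attribute entirely, it only makes sense while the "at most one ActivityMetadata" guarantee holds.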
I have an XML file that is structured something like this:
<element1>
<element2>
<element3>
<elementIAmInterestedIn attribute="data">
<element4>
<element5>
<element6>
<otherElementIAmInterestedIn>
<data1>text1</data1>
<data2>text2</data2>
<data3>text3</data3>
</otherElementIAmInterestedIn>
</element6>
</element5>
</element4>
</elementIAmInterestedIn>
<elementIAmInterestedIn attribute="data">
<element4>
<element5>
<element6>
<otherElementIAmInterestedIn>
<data1>text1</data1>
<data2>text2</data2>
<data3>text3</data3>
</otherElementIAmInterestedIn>
</element6>
</element5>
</element4>
</elementIAmInterestedIn>
<elementIAmInterestedIn attribute="data">
<element4>
<element5>
<element6>
<otherElementIAmInterestedIn>
<data1>text1</data1>
<data2>text2</data2>
<data3>text3</data3>
</otherElementIAmInterestedIn>
</element6>
</element5>
</element4>
</elementIAmInterestedIn>
</element3>
</element2>
</element1>
As you can see, I am interested in two elements, the first of which is deeply nested within the root element, and the second of which is deeply nested within that first element. There are multiple (sibling) elementIAmInterestedIn and otherElementIAmInterestedIn elements in the document.
I want to parse this XML file with Java and put the data from all the elementIAmInterestedIn and otherElementIAmInterestedIn elements into either a data structure or Java objects - it doesn't matter much to me as long as it is organized and I can access it later.
I'm able to write a recursive DOM parser method that does a depth-first traversal of the XML so that it touches every element. I also wrote a Java class with JAXB annotations that represents elementIAmInterestedIn. Then, in the recursive method, I can check when I get to an elementIAmInterestedIn and unmarshal it into an instance of the JAXB class. This works fine except that such an object should also contain multiple otherElementIAmInterestedIn.
This is where I'm stuck. How can I get the data out of otherElementIAmInterestedIn and assign it to the JAXB object? I've seen the @XmlElementWrapper annotation, but it seems to work for only one layer of nesting. Also, I cannot use @XmlPath.
Maybe I should scratch that idea and use a whole new approach. I'm really just getting started with XML parsing so perhaps I'm overlooking a more obvious solution. How would you parse an XML document structured like this and store the data in an organized way?
Maybe you should use a SAX parser instead of DOM. With DOM you load the whole document into memory, and in your case you only want to read two kinds of elements, which is quite inefficient.
With a SAX parser you can process only the nodes you are interested in. Here is pseudocode for your task using the SAX parsing model:
1) Keep reading nodes until you reach an <elementIAmInterestedIn> node
2) Store its data in your object
3) Keep reading until you reach an <otherElementIAmInterestedIn> node
4) Store that data too and save the object.
Repeat steps 1 to 4 until you reach the end of the document.
If you try this approach, I suggest first reading this document to understand how a SAX parser works; it is very different from the DOM approach: How to Use SAX
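The steps above could be sketched roughly as follows with the JDK's built-in SAX parser. Interesting and InterestingHandler are hypothetical names, and storing the data1/data2/data3 children in a map is just one possible way to keep the values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical holder for one elementIAmInterestedIn and its nested data. */
final class Interesting {
    String attribute;
    // One map (data1 -> text1, ...) per nested otherElementIAmInterestedIn.
    final List<Map<String, String>> others = new ArrayList<>();
}

final class InterestingHandler extends DefaultHandler {
    final List<Interesting> results = new ArrayList<>();
    private Interesting current;              // non-null while inside elementIAmInterestedIn
    private Map<String, String> currentOther; // non-null while inside otherElementIAmInterestedIn
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for this element
        if (qName.equals("elementIAmInterestedIn")) {
            current = new Interesting();
            current.attribute = atts.getValue("attribute");
        } else if (qName.equals("otherElementIAmInterestedIn") && current != null) {
            currentOther = new LinkedHashMap<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("otherElementIAmInterestedIn") && currentOther != null) {
            current.others.add(currentOther);
            currentOther = null;
        } else if (qName.equals("elementIAmInterestedIn") && current != null) {
            results.add(current);
            current = null;
        } else if (currentOther != null) {
            // A child such as data1: remember its text under its element name.
            currentOther.put(qName, text.toString().trim());
        }
    }

    /** Parses the XML string and returns one Interesting per matching element. */
    static List<Interesting> parse(String xml) {
        try {
            InterestingHandler handler = new InterestingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.results;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

The handler never looks at element1 through element6, so the depth of nesting does not matter; only the two element names of interest drive the state changes.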
Assuming some xml like
<foo>
<bar>test</bar>
</foo>
Evaluating an expression with returnType = String like
/foo/bar
will return "test". However, I'd like to get the serialized xml instead, so something like
<bar>test</bar>
should be returned instead. As I can not check for the returnType in java's xpath implementation (xerces), I cannot simply get an object as result and if it indeed is a node, convert it to serialized xml.
Note: I don't know whether the expression will actually return a node, a string, a number or whatever so I cannot provide an appropriate return type to the eval function except string which, as my problem states, returns the text content and not the serialized xml.
So I am curious -> is there either a java- or (preferred) a xpath-way (function?) to get serialized xml for type string instead of the text children of the selected node?
thanks!
Alex
Use the XPath return type XPathConstants.NODE; you can then serialize the returned Node yourself.
Now, you are right to observe that it's difficult to discover the return type of the result; this is a real design weakness of JAXP.
If it's a problem to you, consider using Saxon's s9api interface, which returns XdmValue objects whose type you can interrogate; you also get XPath 2.0 access as a bonus.
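A minimal sketch of the XPathConstants.NODE suggestion, using only JAXP classes from the JDK. XPathSerializer is a hypothetical helper name; the sketch targets expressions that select an element, and an expression that yields a number or string would most likely fail with an XPathExpressionException (wrapped here in an IllegalStateException).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathSerializer {
    /** Evaluates the expression as a node and returns its XML, or null if nothing matched. */
    static String evaluateToXml(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            if (node == null) {
                return null;
            }
            // An identity transform copies the selected node to the writer as markup.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }
}
```

With the document from the question, evaluateToXml("<foo><bar>test</bar></foo>", "/foo/bar") returns the serialized element rather than just its text content.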
As Michael Kay answered, this is difficult in JAXP (the native Java interface).
In Mr Kay's Saxon library's s9api API (see Evaluating XPath Expressions using s9api), once you've called XPathSelector.evaluate() or XPathSelector.evaluateSingle() you can get the XML serialisation by calling XdmValue.toString().
However, if the XPath selected an attribute (e.g. //@name) you will still get the XML serialisation, e.g. name="value". You can call XdmItem.getStringValue(), but for elements that method returns the same value you're already seeing: the textual content of the element, not the serialisation. I've posted separately about how to distinguish between attributes and elements returned from Saxon s9api.
While there is plenty of documentation about XML document structure, there are very few links to how non textual data (eg. integer and decimal numbers, boolean values) should be managed.
Is there a consolidated standard? Could you tell me a good starting point?
POST SCRIPT
I add an example because the question is actually too generic.
I've defined a document structure:
<rectangle>
<width>12.45</width>
<height>23.34</height>
<rounded_corners>true</rounded_corners>
</rectangle>
Since I'm using DOM, the API is oriented to textual data. Say doc is an instance of Document: doc.createTextNode("takes a string"). That is, the API doesn't force a particular representation.
So two questions arise:
1) Is saving the boolean true as 'true', instead of, say, 1/0, a standard?
2) I have to define simple methods that write and parse these string representations. For example, I could use the Java wrapper classes for this conversion:
Float f = 23.34f;
doc.createTextNode(f.toString());
But does that adhere to the XML standard for decimal numbers? If so, I can say that non-Java programs can read this data, because the data representation is XML.
Why not JAXB?
This is just an example; my XML is made up of a big tree of data and I need to write and read just a small part of it, so JAXB does not seem to fit very well. A binding architecture tends to establish a tight coupling between the XML structure and the class structure. Good implementations like MOXy let you loosen this coupling, but there are cases in which the two structures simply don't match. In those cases, the DTO used to adapt them would be more work than using DOM.
Of course, there are dozens of official standards related to core XML. Some of the important ones:
XML schema 1.1 overview
XML schema 1.1 - datatypes ('XSD')
XML schema 1.1 - structures
Core XML overview
XML 1.0
XML 1.1
Further specifications such as XPath and XQuery are available at www.w3.org/standards/xml/. As you tagged this question with java, you may be interested in JAXB as well, which defines a standard approach for mapping XML to Java and vice versa.
You can leverage the javax.xml.bind.DatatypeConverter class to convert simple Java types to a String that conforms to the XML Schema specification:
Float f = 23.34f;
doc.createTextNode(DatatypeConverter.printFloat(f));
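Since javax.xml.bind was removed from the JDK in Java 11, here is a dependency-free sketch of the same idea for a few XML Schema datatypes. XsdLexical and its methods are hypothetical names, not a library API, and parseDecimal is deliberately lenient (BigDecimal also accepts exponent notation, which xsd:decimal does not).

```java
import java.math.BigDecimal;

final class XsdLexical {
    /** xsd:boolean accepts "true", "false", "1" and "0"; anything else is rejected. */
    static boolean parseBoolean(String s) {
        String v = s.trim();
        if (v.equals("true") || v.equals("1")) {
            return true;
        }
        if (v.equals("false") || v.equals("0")) {
            return false;
        }
        throw new IllegalArgumentException("Not an xsd:boolean: " + s);
    }

    /** Writes the canonical form, "true" or "false". */
    static String printBoolean(boolean b) {
        return b ? "true" : "false";
    }

    /** xsd:decimal has no exponent notation, so use toPlainString() rather than toString(). */
    static String printDecimal(BigDecimal d) {
        return d.toPlainString();
    }

    /** Lenient parse: also accepts forms such as "1E3" that xsd:decimal itself forbids. */
    static BigDecimal parseDecimal(String s) {
        return new BigDecimal(s.trim());
    }

    /** xsd:float spells the special values INF, -INF and NaN, unlike Float.toString(). */
    static String printFloat(float f) {
        if (Float.isNaN(f)) {
            return "NaN";
        }
        if (Float.isInfinite(f)) {
            return f > 0 ? "INF" : "-INF";
        }
        return Float.toString(f);
    }
}
```

This also answers the two questions directly: 'true' is the canonical xsd:boolean form while 1/0 are accepted alternatives, and Float.toString() is fine for ordinary xsd:float values but not for infinities, nor for xsd:decimal when the value would print with an exponent.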
Are there any helpers that will transform/escape a string into a valid XML name?
For example, I have the string max(OfAll) and need to generate some XML like
<max(OfAll)>SomeText</max(OfAll)>
That's obviously not a valid name. Are there helper methods that can transform the string into a valid XML name?
(For comparison, .NET has some methods with which the above XML fragment would be:
<max_x028_OfAll_x028_>SomeText</max_x028_OfAll_x028_>)
The encoding in your .NET example looks like the one defined in ISO 9075. I don't think there is a built-in implementation in the JDK, but this encoding is also used by content repositories such as Alfresco and Jackrabbit for their XML import/export and query APIs. A quick search turned up these two implementations, both available under open source licenses:
http://www.docjar.com/html/api/org/apache/jackrabbit/util/ISO9075.java.html
http://kickjava.com/src/org/alfresco/util/ISO9075.java.htm
One class which may be of use in other situations is StringEscapeUtils in the Apache commons-lang project. It can escape text for use in XML documents, but I'm not aware of anything that escapes XML element names.
Could you not generate something more readable such as
<aggregation type="max(OfAll)">SomeText</aggregation>
There are lots of libraries available to marshal/unmarshal objects to XML and back, including JAXB (part of the JDK), JiBX, Castor and XStream.
I don't know of any helper methods for that, but the rules at http://www.w3.org/TR/REC-xml/#NT-Name are pretty straightforward, so it should be easy to implement one.
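As a rough sketch of such an implementation, the class below checks each character against the Name production of XML 1.0 and replaces anything else with _xHHHH_ (the code point in hex, at least four digits). XmlNames is a hypothetical helper, the escape format is this sketch's own convention, and ':' is treated as invalid on the assumption that the result is used as a namespace-free local name.

```java
final class XmlNames {
    /** NameStartChar from the XML 1.0 Name production, minus ':'. */
    static boolean isNameStartChar(int c) {
        return (c >= 'A' && c <= 'Z') || c == '_' || (c >= 'a' && c <= 'z')
                || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
                || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
                || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
                || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
                || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
                || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    /** NameChar: everything allowed at the start plus digits, '-', '.' and a few marks. */
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.' || (c >= '0' && c <= '9')
                || c == 0xB7 || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    /** Replaces every character that is not allowed at its position with _xHHHH_. */
    static String mangle(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); ) {
            int c = raw.codePointAt(i);
            boolean allowed = (i == 0) ? isNameStartChar(c) : isNameChar(c);
            if (allowed) {
                out.appendCodePoint(c);
            } else {
                out.append(String.format("_x%04X_", c));
            }
            i += Character.charCount(c);
        }
        // An empty input has no valid name; fall back to a single underscore.
        return out.length() == 0 ? "_" : out.toString();
    }
}
```

For example, XmlNames.mangle("max(OfAll)") yields max_x0028_OfAll_x0029_. The mapping is not reversible as written, because an input that already contains a literal _xHHHH_ sequence is passed through unchanged.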
As should be clear, normal XML escaping (replacing inappropriate characters with character entities) does not result in a valid XML identifier.
For the record, what you are doing is frequently called "name mangling".