I am updating some XML parsers and have hit a small snag. We have an XSD that must stay compatible with older versions of the XML, and we had to make some changes to it. We made the changes in a new version of the XSD, and we would like to keep using the same parser (the changes are small, and the parser can easily handle both versions). We use the XMLReader property "http://java.sun.com/xml/jaxp/properties/schemaSource" to set the schema to the previous edition, using something like the following:
xmlReader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
new InputSource(getClass().getResourceAsStream("/schema/my-xsd-1.0.xsd")));
This worked fine when we only had one version of the schema. Now we have a new version, and we want the system to use whichever version of the schema is defined in the incoming xml. Both schemas define a namespace, something like the following:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.mycompany.com/my-xsd-1.0"
xmlns="http://www.mycompany.com/my-xsd-1.0"
elementFormDefault="unqualified" attributeFormDefault="unqualified">
and, for the new one:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.mycompany.com/my-xsd-1.1"
xmlns="http://www.mycompany.com/my-xsd-1.1"
elementFormDefault="unqualified" attributeFormDefault="unqualified">
So, they have different namespaces and different schema "locations" defined. We don't want the schemas to live on the 'net - we want them to be bundled with our system. Is there a way to get this behavior through the setProperty mechanism, or is there a different way to handle this?
I tried passing an array containing an input source for each schema resource as the property value, but that didn't work (I remember reading somewhere that this was a possible solution - although now I can't find the source, so it might have been wishful thinking).
So, it turns out what I had tried actually worked - we were accidentally using invalid xml! What works (for anyone else who is interested) is the following:
List<InputSource> inputs = new ArrayList<InputSource>();
inputs.add(new InputSource(getClass().getResourceAsStream("/schema/my-xsd-1.0.xsd")));
inputs.add(new InputSource(getClass().getResourceAsStream("/schema/my-xsd-1.1.xsd")));
xmlReader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
inputs.toArray(new InputSource[inputs.size()]));
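For anyone wiring this up from scratch, here is a minimal end-to-end sketch, assuming the parser bundled with the JDK. The two inline schemas are simplified stand-ins for my-xsd-1.0.xsd and my-xsd-1.1.xsd (each declares only a single item element), and the class and method names are made up for the example. As far as I can tell, the factory needs to be namespace-aware and validating, and the schemaLanguage property needs to be set before schemaSource.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class MultiSchemaSketch {
    static final String SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE = "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Hypothetical, simplified stand-in for one schema version: a single <item> of type string.
    static String schema(String version) {
        String ns = "http://www.mycompany.com/my-xsd-" + version;
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='" + ns + "'>"
             + "<xs:element name='item' type='xs:string'/></xs:schema>";
    }

    // Parses the document against both schemas and returns the validation errors (empty if valid).
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            // Schema language first, then the schema sources.
            parser.setProperty(SCHEMA_LANGUAGE, "http://www.w3.org/2001/XMLSchema");
            parser.setProperty(SCHEMA_SOURCE, new InputSource[] {
                new InputSource(new StringReader(schema("1.0"))),
                new InputSource(new StringReader(schema("1.1")))
            });
            XMLReader reader = parser.getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // collect instead of silently ignoring
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            errors.add(String.valueOf(e)); // fatal parse errors and configuration problems
        }
        return errors;
    }
}
```

The parser picks the schema by the namespace of the root element, so a document in either namespace comes back with an empty error list, while an element that neither schema declares is reported.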
Personally I think it's generally a bad idea to change the namespace when you version a schema, unless the changes are radical - but views differ on that, and you seem to have made your decision, and you may as well reap the benefits.
Since you're using two different namespaces, the schemas are presumably disjoint, so you should be able to give the processor a schema that's the union of the two - I don't know if there's a better way, but one way of achieving this is to write a little stub schema that imports both, and supply this stub as your schemaSource property. The processor will use whichever schema declarations match the namespace of the elements in the source document.
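A rough sketch of the stub idea, under some assumptions: the schemas are simplified stand-ins written to a temporary directory so the example is self-contained, the file name stub.xsd and all class and method names are invented, and the stub is loaded through javax.xml.validation rather than the schemaSource property to keep the code short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class StubSchemaSketch {
    static final String NS_PREFIX = "http://www.mycompany.com/my-xsd-";

    // Hypothetical, simplified stand-in for one schema version.
    static String versionSchema(String version) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='"
             + NS_PREFIX + version + "'><xs:element name='item' type='xs:string'/></xs:schema>";
    }

    // Writes both schemas plus a stub that imports them, then compiles the stub.
    static Schema loadStub() throws IOException, SAXException {
        Path dir = Files.createTempDirectory("schemas"); // left for the OS to clean up in this sketch
        Files.write(dir.resolve("my-xsd-1.0.xsd"), versionSchema("1.0").getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("my-xsd-1.1.xsd"), versionSchema("1.1").getBytes(StandardCharsets.UTF_8));
        // The stub declares nothing itself; it only pulls in the two versioned schemas.
        String stub = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:import namespace='" + NS_PREFIX + "1.0' schemaLocation='my-xsd-1.0.xsd'/>"
            + "<xs:import namespace='" + NS_PREFIX + "1.1' schemaLocation='my-xsd-1.1.xsd'/>"
            + "</xs:schema>";
        Path stubFile = dir.resolve("stub.xsd");
        Files.write(stubFile, stub.getBytes(StandardCharsets.UTF_8));
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(stubFile.toFile());
    }

    static boolean isValid(String xml) {
        try {
            loadStub().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handling throws on the first validation error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real deployment the three files would presumably be bundled resources. Relative schemaLocation values are resolved against the stub's system ID, so the stub would need to be supplied with one (for example the URL returned by getClass().getResource(...)) rather than as a bare stream.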
(Using version-specific namespaces makes this task - validation - easier. But it makes subsequent processing of the XML, e.g. using XPath, harder, because it's hard to write code that works with both namespaces.)
Related
I have a bunch of external XSDs which I can't change.
They use wildcard content models (xs:any) with processContents="skip".
Question:
Is there a programmatic way in Java/JAXP to force strict wildcard matching (xs:any with processContents="strict") instead, without changing the XSDs?
Sure. Use Java to modify the schema before using it for validation. You don't have to change the original schema, just the one that you're validating against.
If you're using XSD 1.1 you could create the locally-modified schema using xs:override.
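One way the "modify it in Java" idea could look, as a sketch only: parse the XSD into a DOM, rewrite processContents on every wildcard, and compile the modified copy. The inline schema with its box element is invented for the example, as are the class and method names. Note that this only touches the one schema document passed in, not schemas it includes or imports.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class StrictWildcardSketch {
    // Hypothetical schema: <box> may contain anything, and the content is skipped.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='box'><xs:complexType><xs:sequence>"
      + "<xs:any processContents='skip' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Compiles the schema, optionally forcing every wildcard to processContents="strict".
    static Schema compile(String xsd, boolean forceStrict) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
            if (forceStrict) {
                for (String name : new String[] {"any", "anyAttribute"}) {
                    NodeList wildcards = doc.getElementsByTagNameNS(XMLConstants.W3C_XML_SCHEMA_NS_URI, name);
                    for (int i = 0; i < wildcards.getLength(); i++) {
                        ((Element) wildcards.item(i)).setAttribute("processContents", "strict");
                    }
                }
            }
            // Only the in-memory copy is changed; the original XSD text stays as it was.
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(new DOMSource(doc));
        } catch (Exception e) {
            throw new IllegalStateException("Could not compile schema", e);
        }
    }

    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the unmodified schema an undeclared child inside box is accepted; with the strict copy the same document is rejected because no declaration exists for that child.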
I am looking for an extension of the doc() functionality currently available in Saxon that reads XML not from the filesystem or over HTTP, but from memory, where I already hold those XML documents.
The way I want to use it is like:
mydoc('id')/root/subroot/@myattr
or
doc('mydoc://id')/root/subroot/@myattr
What I have considered so far:
1. Use queryEvaluator.setContextItem() - does not solve my use case, as I can have multiple XML sources in one query.
2. Register my own URL scheme (protocol handler) in Java - seems like overkill to me, and I have never done this.
3. Write my own ExtensionFunction - seems to be the right way so far, but I am confused about whether I should use ExtensionFunction or rather ExtensionFunctionDefinition. I am also a little confused by the Doc_1 and Doc source code from Saxonica, as it uses Atomizer and other internal classes I don't know.
So the questions are:
Is variant 3 the best one (in terms of simplicity), or would you recommend some other approach?
Is it OK to use ExtensionFunction and return an XdmNode built from my in-memory XML? It seems to me it should work, but I really do not want to step into some edge case or Saxon minefield.
Any comment from an experienced Saxon user will be appreciated.
The standard way of doing this is to write a URIResolver and register it with the transformer. The URIResolver is called, supplying the requested URI, and it is expected to return a Source (which can be a StreamSource, SAXSource, or DOMSource, for example). In this scenario you would typically return a StreamSource wrapping a StringReader which wraps the String containing the XML.
You could equally well use an extension function, but it's probably a little bit more complicated.
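To make the URIResolver suggestion concrete, here is a small sketch. It deliberately uses only the XSLT processor built into the JDK and the XSLT document() function as a stand-in for Saxon's doc(), so the class names, the stylesheet, and the map of in-memory documents are all assumptions for illustration. The resolver class itself only depends on the standard javax.xml.transform interfaces; where to register it in Saxon depends on the API and version you use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class InMemoryResolverSketch {

    // Maps requested URIs (e.g. "mydoc://id") to XML held in memory.
    static final class InMemoryResolver implements URIResolver {
        private final Map<String, String> documents;

        InMemoryResolver(Map<String, String> documents) {
            this.documents = documents;
        }

        @Override
        public Source resolve(String href, String base) {
            String xml = documents.get(href);
            if (xml == null) {
                return null; // unknown URI: let the processor fall back to its default resolution
            }
            return new StreamSource(new StringReader(xml));
        }
    }

    // Hypothetical stylesheet that reads an attribute from an in-memory document.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"document('mydoc://id')/root/subroot/@myattr\"/>"
      + "</xsl:template></xsl:stylesheet>";

    static String run(Map<String, String> documents) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setURIResolver(new InMemoryResolver(documents));
            StringWriter out = new StringWriter();
            // The principal input is irrelevant here; all data comes through the resolver.
            transformer.transform(new StreamSource(new StringReader("<dummy/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the resolver is keyed by the requested URI, one query or stylesheet can pull in any number of in-memory documents, which is the limitation of the context-item approach mentioned in the question.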
I easily found JAXB for importing XML into Java code, however, after looking at it a bit more, I started wondering if it were more than I really needed.
It should be rather simple XML that I or other users would create.
For example:
<Type>Armor Material</Type> //could be various types of parent objects
<Name>Steel</Name> //object properties
<Toughness>10</Toughness>
<Type>Armor Material</Type>
<Name>Iron</Name>
<Toughness>7</Toughness>
For the background on my problem: I have a game written in Java, and aim to have many Objects of certain types defined in the XML. I'm hoping to keep the XML as simple as possible for easy user-modding.
I know how to read from a file for creating my own custom solution - but I have never dealt with marshalling/unmarshalling and JAXB in general. I won't lie - something about it intimidates me, maybe because it seems like this "black box" which I don't quite understand.
Are there clear advantages that argue for learning how to get it to work, as opposed to implementing a solution I already know I can get to work?
You definitely want to use JAXB.
Whether your XML is simple or complex, write an XML schema (XSD) file. You want the schema file anyway, so you can validate the files you are reading. Use xjc (part of JAXB) to generate Java classes for all the elements of your XML schema (complete with setters/getters). After that, reading or writing an XML file is a one-liner.
Because the XML file is mapped to/from Java objects, it is very easy to manipulate these data structures (to create or consume them) in Java.
JAXB has a plugin architecture, and there are quite a few open source plugins that you can use to enhance the generated classes. By default, JAXB generates all your setters/getters automatically, but there are plugins that will generate equals/hashCode, fluent-style methods, clone, etc. There is even a plugin (hyperjaxb3) that will put JPA annotations on the generated classes, so you can go XML->Java->database->Java->XML all based on the XML schema.
I have worked on projects that used JAXB to generate POJOs even though we didn't need XML - it was quicker to write and easier to maintain the XML schema than all the Java code for the POJOs.
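For a sense of what JAXB would be replacing, here is a minimal hand-rolled reader using only the JDK's DOM API. This is not JAXB; it is the kind of custom solution the question mentions, sketched for comparison. The Materials and Material wrapper elements are assumptions added so the sample is well-formed XML, and the class names are invented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MaterialReaderSketch {

    // Plain value object for one entry; with JAXB a class like this would be generated or annotated.
    static final class Material {
        final String type;
        final String name;
        final int toughness;

        Material(String type, String name, int toughness) {
            this.type = type;
            this.name = name;
            this.toughness = toughness;
        }
    }

    // Reads every <Material> element (hypothetical wrapper) into a Material object.
    static List<Material> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("Material");
            List<Material> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                result.add(new Material(
                    text(e, "Type"), text(e, "Name"), Integer.parseInt(text(e, "Toughness"))));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read materials", e);
        }
    }

    // Text of the first child element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}
```

Every new property means another hand-written field and another text(...) call here, and nothing validates the file, which is roughly the maintenance cost the schema-plus-xjc route is meant to remove.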
If you're using Java 8, perhaps a dynamic style would be a good fit:
XmlDynamic xml = new XmlDynamic(
"<items>" +
"<item>" +
"<type>Armor Material</type>" +
"<name>Steel</name>" +
"<toughness>10</toughness>" +
"</item>" +
"<item>" +
"<type>Armor Material</type>" +
"<name>Iron</name>" +
"<toughness>7</toughness>" +
"</item>" +
"</items>"
);
xml.get("items|item|name").asString(); // "Steel"
xml.get("items|item[1]|toughness").convert().intoInteger(); // 7
see https://github.com/alexheretic/dynamics#xml-dynamics
I know that I can compile multiple XSD files into a single jar. I've tried using different namespaces, which only gets me halfway to my goal. This way I can parse against the correct schema, but I want this to be transparent to my users, who will receive the XMLBeans object that I've parsed.
They don't have to know which version of xml file is currently present on the system. I would need a super class for every xsd version to achieve this.
Could this be done with XMLBeans?
My understanding is that you have a com namespace plus com.v1 and com.v2 namespaces, with an XSD element called EmployeeV1 in com.v1 and one called EmployeeV2 in com.v2.
You want a superclass called Employee in the com namespace that you can return to your caller?
Do you think EmployeeV1 and EmployeeV2 could extend Employee in your XSD? Then, when you generate, you may get a class hierarchy that mirrors your XSD.
If that doesn't work (I haven't used XMLBeans in years), you might have to create your own domain object and make your callers consume that. It might be worth the effort: since it looks like you handle the parsing of XML that other people rely on, an intermediary domain object would shield all other users from the structure of the XML (which is in flux).
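The intermediary domain object could look roughly like this. It is plain Java with no XMLBeans involved: EmployeeV1 and EmployeeV2 below are hand-written stand-ins for whatever types the generator produces, and their fields are invented for the example.

```java
final class EmployeeAdapterSketch {

    // The version-neutral view that callers see.
    interface Employee {
        String displayName();
    }

    // Stand-in for the type generated from the v1 schema (hypothetical shape).
    static final class EmployeeV1 {
        private final String name;

        EmployeeV1(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Stand-in for the type generated from the v2 schema (hypothetical shape).
    static final class EmployeeV2 {
        private final String firstName;
        private final String lastName;

        EmployeeV2(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFirstName() {
            return firstName;
        }

        String getLastName() {
            return lastName;
        }
    }

    // One adapter per schema version; only this class knows the version-specific types.
    static Employee adapt(EmployeeV1 v1) {
        return v1::getName;
    }

    static Employee adapt(EmployeeV2 v2) {
        return () -> v2.getFirstName() + " " + v2.getLastName();
    }
}
```

Callers only ever hold an Employee, so adding a v3 schema later means adding one more adapt overload rather than touching every consumer.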
I have a schema in an XSD file. Once in a while a new version of the schema is created, and I need to update my .ecore (and .genmodel).
How do I update them without deleting and regenerating them? I have made some manual modifications to the .ecore, and I want to keep these modifications.
Ido.
Use the Reload... action on the *.genmodel to update the *.ecore based on the new version of the *.xsd.
And don't change the .ecore directly. Use ecore: annotations in the schema instead: http://www.eclipse.org/modeling/emf/docs/overviews/XMLSchemaToEcoreMapping.pdf
I've never tried this, but the XSD FAQ says this:
"JAXB produces a simple Java API given an XML Schema and it does so using essentially a black box design. EMF produces an Ecore model given an XML Schema and then uses template-based generator technology to generate a rich Java API (of hand written quality). The XML Schema to Ecore conversion can be tailored, the templates used to generate the Java API can be tailored, and the resulting Java API can be tailored. The generator supports merging regeneration so that it will preserve your hand written changes. In other words, EMF is far richer and more flexible, and supports a broader subset of XML Schema (especially in 2.0, where wildcards and mixed content will be supported)."
If I were you, I'd try some experiments to see how well this process works, and what the practical limitations are.
You can regenerate using the context menu options. To preserve your modifications:
If there is a method that has "Gen" added to the name -- e.g. setWhateverGen in addition to setWhatever -- new code will be generated into the "Gen" method. So leave the "Gen" method alone so that it can be overwritten, and call it from the non-Gen method, which you can modify.
All the generated methods are tagged with @generated. If you add "NOT" -- @generated NOT -- the method will not be overwritten.
All other content should be merged. Go ahead and experiment -- that's what version control is for....
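To illustrate the two conventions, here is a plain Java sketch. It is hand-written to show the shape, not actual generator output, and the class name and the trimming and defaulting logic are made up for the example.

```java
final class WhateverHolderSketch {
    private String whatever;

    /**
     * Regenerated on every run; leave it untouched.
     *
     * @generated
     */
    public void setWhateverGen(String newWhatever) {
        whatever = newWhatever;
    }

    /**
     * Hand-written wrapper around the Gen method. Custom logic lives here,
     * while the generator keeps writing into setWhateverGen.
     */
    public void setWhatever(String newWhatever) {
        setWhateverGen(newWhatever == null ? null : newWhatever.trim());
    }

    /**
     * Originally generated, then edited by hand; the NOT marker tells the
     * merge step to keep this body.
     *
     * @generated NOT
     */
    public String getWhatever() {
        return whatever == null ? "" : whatever;
    }
}
```

The first pattern keeps the generated body up to date while layering custom behavior on top; the second freezes a method entirely, so later model changes that affect it have to be applied by hand.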