remove xs:annotation elements from schema - java

I have a number of XSD schemas with so much documentation in them that they are hard to read and use. How can I write a program that produces equivalent XSD files with every xs:annotation element removed (including any xs:appinfo, xs:documentation or other elements it contains), wherever it occurs?

You could run each of your files through an XSLT to strip out the unwanted elements:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="xs:annotation" />
</xsl:stylesheet>
As noted by @IanRoberts, you only need to remove the xs:annotation elements; the xs:appinfo and xs:documentation elements they contain are removed along with them.
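Since the question asks for a program, here is a minimal sketch of how the stylesheet could be driven from Java using the JAXP API that ships with the JDK. The class name StripAnnotations and its strip methods are made up for illustration, and the stylesheet is embedded as a string only to keep the example self-contained; loading it from a file would work just as well.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the name and method signatures are illustrative only.
public class StripAnnotations {

    // Identity transform plus an empty template that drops every xs:annotation.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xsl:template match='@* | node()'>"
      + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='xs:annotation'/>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to any JAXP source and writes to any JAXP result.
    static void strip(Source in, Result out) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.transform(in, out);
    }

    // Convenience overload for in-memory schemas.
    static String strip(String xsd) throws TransformerException {
        StringWriter buffer = new StringWriter();
        strip(new StreamSource(new StringReader(xsd)), new StreamResult(buffer));
        return buffer.toString();
    }

    // Usage: java StripAnnotations input.xsd output.xsd
    public static void main(String[] args) throws TransformerException {
        strip(new StreamSource(new File(args[0])), new StreamResult(new File(args[1])));
    }
}
```

To process a whole directory you would call strip once per file; XML comments in the schema are kept, since only xs:annotation is suppressed.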

Related

Redact XML using XSLT

Wondering if anyone can help with an XSLT issue I am facing.
I am trying to create an XSLT script that takes an XML document as input and changes the values of several fields to "xxxx". I have this part working, but I would now like it to run only if one field in the input XML has a specific value (e.g. if username is jbond).
I would like to keep this condition within my XSLT if possible, but I am having difficulty.
My current XML, XSLT, output and expected output are as follows.
XML:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl"?>
<rootDoc>
<user>test</user>
<tel>12345</tel>
<zip>abcd</zip>
</rootDoc>
XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" >
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:if test="user = 'test'">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:if>
</xsl:template>
<xsl:template match="tel/text()">XXXX</xsl:template>
<xsl:template match="zip/text()">XXXX</xsl:template>
</xsl:stylesheet>
Output:
<?xml version="1.0" encoding="UTF-8"?>
<rootDoc/>
Expected:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl"?><rootDoc>
<user>test</user>
<tel>XXXX</tel>
<zip>XXXX</zip>
</rootDoc>
If you leave the identity transform alone and add specific templates, XSLT will automatically use the most specific matching template for each node. You can customize the select="" or add more templates as necessary.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" >
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="tel"><tel>XXXX</tel></xsl:template>
<xsl:template match="zip"><zip>XXXX</zip></xsl:template>
</xsl:stylesheet>
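Putting the test into the identity transformation template does not make much sense: the test is evaluated relative to each node being copied, so it is true only for rootDoc and every other node is dropped. If you want to perform certain changes only when a condition holds, then the templates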
<xsl:template match="tel/text()">XXXX</xsl:template>
<xsl:template match="zip/text()">XXXX</xsl:template>
should be changed to
<xsl:template match="rootDoc[user = 'test']/tel/text()">XXXX</xsl:template>
<xsl:template match="rootDoc[user = 'test']/zip/text()">XXXX</xsl:template>
which could be joined into
<xsl:template match="rootDoc[user = 'test']/tel/text() | rootDoc[user = 'test']/zip/text()">XXXX</xsl:template>
With a single condition on the root element (rootDoc in the sample), you can instead use
<xsl:template match="/rootDoc[not(user = 'test')]">
<xsl:copy-of select="."/>
</xsl:template>
then the other cases are handled by the identity transformation and the specialized templates.

Add namespaces to XML from XSD

For example I have XML:
<a>
<b>c</b>
</a>
xsdA.xsd:
<xs:import schemaLocation="xsdB.xsd"/>
<xs:element name="a" xmlns:xsa="http://www.example.org/a" type="xsa:aType"></xs:element>
xsdB.xsd:
<xs:element name="b" xmlns:xsb="http://www.example.org/b" type="xsb:bType"></xs:element>
I want to somehow transform XML into this:
<xsa:a xmlns:xsa="http://www.example.org/a">
<xsb:b xmlns:xsb="http://www.example.org/b">c</xsb:b>
</xsa:a>
I hear that it can be done with JAXB, but is there any way to do this without code generation?
I use Java.
EDIT:
This is just an example. I can add namespaces with XSLT or manually in a DOM object, but my XSDs are 170K in size now and change often. I want to simply replace the XSD and have the program keep working.
How can I find the namespace of an element with a given local name in a bunch of XSD files?
EDIT2:
All local names seem to be different in my XSDs.
As laune suggested, here is a very basic XSLT that should get you started.
XSLT:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsa="http://www.example.org/a"
xmlns:xsb="http://www.example.org/b">
<xsl:output omit-xml-declaration="no" indent="yes" />
<xsl:template match="a">
<xsa:a><xsl:apply-templates select="node() | @*" /></xsa:a>
</xsl:template>
<xsl:template match="b">
<xsb:b><xsl:apply-templates select="node() | @*" /></xsb:b>
</xsl:template>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Input:
<a>
<b>c</b>
</a>
Output:
<xsa:a xmlns:xsa="http://www.example.org/a" xmlns:xsb="http://www.example.org/b">
<xsb:b>c</xsb:b>
</xsa:a>
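The stylesheet above hard-codes the two namespaces. For the EDIT's question (finding the namespace of a given local name across a bunch of XSD files), one possible approach is to build a lookup table from the schemas themselves with plain DOM parsing and no code generation. The sketch below makes several assumptions: each XSD is a complete xs:schema document with a targetNamespace attribute, local names are unique across all schemas (as EDIT2 says), and every named xs:element is meant to be namespace-qualified. The class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps element local names to the targetNamespace
// of the schema that declares them.
public class ElementNamespaceIndex {

    private static final String XS = "http://www.w3.org/2001/XMLSchema";

    // Parses one schema (given as a string) and records every named xs:element.
    static void index(String xsd, Map<String, String> index) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xsd)));

        // Assumption: the root element is xs:schema and carries targetNamespace.
        String targetNamespace = doc.getDocumentElement().getAttribute("targetNamespace");

        NodeList elements = doc.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < elements.getLength(); i++) {
            Element e = (Element) elements.item(i);
            // Skip <xs:element ref="..."/>, which declares no new name.
            if (e.hasAttribute("name")) {
                index.put(e.getAttribute("name"), targetNamespace);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> index = new HashMap<>();
        index("<xs:schema xmlns:xs='" + XS + "' targetNamespace='http://www.example.org/a'>"
            + "<xs:element name='a'/></xs:schema>", index);
        index("<xs:schema xmlns:xs='" + XS + "' targetNamespace='http://www.example.org/b'>"
            + "<xs:element name='b'/></xs:schema>", index);
        System.out.println(index);
    }
}
```

The resulting map could then drive the renaming step, whether that is done in DOM or by passing the namespaces into a stylesheet. Because the index is rebuilt from whatever XSD files are present, replacing a schema would not require changing the program.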

Delete XSL output file generation at runtime

I am using XSL to read an XML file and write text output files.
I check whether particular content is available in the source XML and write that content to a file if it is.
But if the content is not available (the <xsl:if> condition is not fulfilled), the output file is created empty.
So I want to add an else branch that prevents the XSL output file from being created at runtime.
Does anybody have a clue?
<xsl:message terminate="yes"> won't help, because the output file is still generated; it only terminates further processing of the XSL.
Can anybody help, or suggest another approach in Java code that avoids deleting files after they have been created (by reading them and identifying the empty ones)?
Currently I use Java to read the created empty files and delete them explicitly. Thanks in advance.
I will give two examples of how this can be done; the second is the one I recommend:
Suppose we have this XML document:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
and we want to produce another one from it, in which the num elements with even numbers are "deleted".
One way of doing this is:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<nums>
<xsl:apply-templates/>
</nums>
</xsl:template>
<xsl:template match="num">
<xsl:choose>
<xsl:when test=". mod 2 = 1">
<num><xsl:value-of select="."/></num>
</xsl:when>
<!-- <xsl:otherwise/> -->
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The wanted result is produced:
<nums>
<num>01</num>
<num>03</num>
<num>05</num>
<num>07</num>
<num>09</num>
</nums>
Note: to "do nothing" you do not even need the <xsl:otherwise>, which is why it is commented out.
A better solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="num[. mod 2 = 0]"/>
</xsl:stylesheet>
This produces the same correct result.
Here we are overriding the identity rule with a template matching num elements with even value and with empty body -- which does the "delete".
Note:
Here we don't use any explicit "if-then-else" instructions at all, just template pattern matching, which is the most distinguishing feature of XSLT.
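On the Java side of the question (not creating the file at all when there is nothing to write), one option is to transform into memory first and only touch the file system when the result is non-blank. This is a rough sketch under the assumption that the output is small enough to buffer; WriteIfNotEmpty and transformToFile are illustrative names, not part of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for illustration.
public class WriteIfNotEmpty {

    // Runs the stylesheet and writes the target file only if the result
    // contains something other than whitespace. Returns true if a file was written.
    static boolean transformToFile(String xslt, String xml, Path target)
            throws TransformerException, IOException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));

        // Buffer the output in memory instead of handing the file to the transformer.
        StringWriter buffer = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(buffer));

        String result = buffer.toString();
        if (result.trim().isEmpty()) {
            return false; // nothing to write, so no file is ever created
        }
        Files.write(target, result.getBytes(StandardCharsets.UTF_8));
        return true;
    }
}
```

The trade-off is memory: for very large outputs you might prefer writing to a temporary file and moving it into place only when it turns out to be non-empty.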

XSLT for XML containing multiple namespaces

I'm working on an XSLT that is giving me a little headache, and was looking for some tips. I'm converting an XML document where some of the tags have namespace prefixes and others do not, and I want to convert all of the tags to one common namespace prefix.
Example of XML:
<yes:Books>
<no:Book>
<Title>Yes</Title>
<maybe:Version>1</maybe:Version>
</no:Book>
</yes:Books>
What I'm trying to get:
<yes:Books>
<yes:Book>
<yes:Title>Yes</yes:Title>
<yes:Version>1</yes:Version>
</yes:Book>
</yes:Books>
The XML input is the aggregate of several web services that return various namespaces. I have no issue aggregating it together appropriately; it's creating one common namespace prefix that I am having an issue with.
Worst case, I could regex them away, but I'm sure that isn't recommended.
Thanks.
This transformation allows the wanted final prefix and its namespace to be specified as external/global parameters. It also shows how to process attribute names in the same way:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pPrefix" select="'yes'"/>
<xsl:param name="pNamespace" select="'yes'"/>
<xsl:template match="*">
<xsl:element name="{$pPrefix}:{local-name()}" namespace="{$pNamespace}">
<xsl:apply-templates select="node()|@*"/>
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{$pPrefix}:{local-name()}" namespace="{$pNamespace}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
when applied on the following document (the provided one with one added attribute to make the problem more challenging):
<yes:Books xmlns:yes="yes">
<no:Book xmlns:no="no">
<Title no:Major="true">Yes</Title>
<maybe:Version xmlns:maybe="maybe">1</maybe:Version>
</no:Book>
</yes:Books>
produces the wanted, correct result:
<yes:Books xmlns:yes="yes">
<yes:Book>
<yes:Title yes:Major="true">Yes</yes:Title>
<yes:Version>1</yes:Version>
</yes:Book>
</yes:Books>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:custom="c">
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="custom:{local-name()}" namespace="c">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>

xslt help - remove empty tags and replaces * with empty tags

I am having a problem while doing some XSLT pre-processing in my java program.
We get an asterisk (*) from a mainframe program when it wants to blank out a value, which my java process has to treat like a blank or empty tag. So we apply an xslt to the input before my jaxb process.
We are applying this xslt :
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no"/>
<xsl:template match="@*[. = '*']">
<xsl:attribute name="{name()}" namespace="{namespace-uri()}">
<xsl:text></xsl:text>
</xsl:attribute>
</xsl:template>
<xsl:template match="*[. = '*']">
<xsl:copy>
<xsl:text></xsl:text>
</xsl:copy>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The above xslt works fine for ALMOST all test cases.
Except in the case where there is only ONE sub-element and that happens to be an asterisk.
For instance consider this in the input:
<MYROOT><Address3><Line2>*</Line2><Line3>*</Line3></Address3></MYROOT>
works well.
It produces this output:
<MYROOT><Address3><Line2/><Line3/></Address3></MYROOT>
The XML input below, however, produces an incorrect response:
<MYROOT><Address4><PermanentAddress><Line2>*</Line2></PermanentAddress></Address4></MYROOT>
Instead of giving the response
<MYROOT><Address4><PermanentAddress><Line2></Line2></PermanentAddress></Address4></MYROOT>
it gives this:
<MYROOT/>
Please help. Any help is appreciated as I did not have this test case while testing my code.
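That's because . is the string value of the element, which is the concatenation of all its descendant text nodes. In the failing input, MYROOT, Address4 and PermanentAddress therefore all have the string value * and match your template. Your condition also needs to check that the element has no child elements, i.e. that its only content is a text node containing *.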
This should work:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no"/>
<xsl:strip-space elements="*"/>
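<!-- Leaf elements whose text is "*": copy the element (and its attributes) without the text -->
<xsl:template match="*[not(*) and (. = '*')]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
</xsl:copy>
</xsl:template>
<!-- Attributes whose value is "*": xsl:copy would keep the value, so emit an empty attribute instead -->
<xsl:template match="@*[. = '*']">
<xsl:attribute name="{name()}" namespace="{namespace-uri()}"/>
</xsl:template>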
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
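If you want to see the string-value behaviour for yourself, a quick check with the JDK's built-in XPath API could look like the sketch below. StringValueDemo and eval are made-up names; the XML is the failing input from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class for illustration.
public class StringValueDemo {

    // Evaluates an XPath 1.0 expression against an XML string and returns the result as a string.
    static String eval(String expr, String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<MYROOT><Address4><PermanentAddress><Line2>*</Line2>"
                   + "</PermanentAddress></Address4></MYROOT>";

        // String value of the root: the concatenation of all descendant text nodes.
        System.out.println(eval("string(/MYROOT)", xml));

        // How many elements the original pattern *[. = '*'] would match.
        System.out.println(eval("count(//*[. = '*'])", xml));

        // How many elements match once child elements are excluded.
        System.out.println(eval("count(//*[not(*) and . = '*'])", xml));
    }
}
```

The first expression yields *, the second counts all four elements (MYROOT, Address4, PermanentAddress and Line2), and the third counts only Line2, which is why adding not(*) fixes the single-child case.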
Replace:
<xsl:template match="*[. = '*']">
<xsl:copy>
<xsl:text></xsl:text>
</xsl:copy>
</xsl:template>
with
<xsl:template match="*[not(*) and not(text()[2])]/text()[.='*']"/>
This is much more efficient than having to calculate the string value of every element, because the string value of an element is the concatenation of all its descendent text nodes.
