Unmarshalling without unique node names - java

I am stuck trying to figure out how to unmarshall an XML file supplied by IBM Cognos.
The structure does not give the child nodes under each row element unique names; instead, a block of metadata defines the order of the values.
This is a simplified sample of the XML file.
<?xml version="1.0" encoding="utf-8"?>
<dataset xmlns="http://developer.cognos.com/schemas/xmldata/1/" xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<!--
<dataset
xmlns="http://developer.cognos.com/schemas/xmldata/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xs:schemaLocation="http://developer.cognos.com/schemas/xmldata/1/ xmldata.xsd"
>
-->
<metadata>
<item name="EmployeeID" type="xs:string" length="20"/>
<item name="firstName" type="xs:string" length="50"/>
<item name="lastName" type="xs:string" length="50"/>
</metadata>
<data>
<row>
<value>EMP1</value>
<value>Joe</value>
<value>Blogs</value>
</row>
<row>
<value>EMP2</value>
<value>Mary</value>
<value>Soap</value>
</row>
</data>
</dataset>
I'm using Spring OXM and Castor for this project, and I have no control over the XML format because I am pulling it via a web service from a third-party system.
Update: I'm not averse to swapping out Castor for a different marshalling/unmarshalling library.

The magic of XSLT to the rescue. By running the provided XML through the following XSLT stylesheet I was able to create an XML file that I could then unmarshall correctly.
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:cognos="http://developer.cognos.com/schemas/xmldata/1/">
<xsl:output method="xml" version="1.0" encoding="UTF-8" standalone="yes" indent="yes"/>
<xsl:template match="/">
<xsl:element name="DataSet">
<xsl:for-each select="//*[name()='row']">
<xsl:variable name="row" select="position()" />
<xsl:element name="Row">
<xsl:for-each select="//*[name()='item']">
<xsl:variable name="elementName" select="@name" />
<xsl:variable name="index" select="position()" />
<xsl:element name="{translate($elementName,' ','_')}">
<xsl:value-of select="//cognos:row[$row]/cognos:value[$index]" />
</xsl:element>
</xsl:for-each>
</xsl:element>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
This transformed the XML file as follows
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DataSet>
<Row>
<EmployeeID>EMP1</EmployeeID>
<firstName>Joe</firstName>
<lastName>Blogs</lastName>
</Row>
<Row>
<EmployeeID>EMP2</EmployeeID>
<firstName>Mary</firstName>
<lastName>Soap</lastName>
</Row>
</DataSet>
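For comparison, the positional mapping that the stylesheet performs can also be done directly in Java before the data is handed to a marshalling library. The following is only a minimal sketch using the JDK's DOM parser; the class name `CognosRows` and the method `readRows` are hypothetical and are not part of Castor or Spring OXM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CognosRows {
    // Namespace taken from the sample dataset in the question.
    static final String NS = "http://developer.cognos.com/schemas/xmldata/1/";

    /** Pairs the n-th value of each row with the name of the n-th metadata item. */
    static List<Map<String, String>> readRows(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList items = doc.getElementsByTagNameNS(NS, "item");
        NodeList rows = doc.getElementsByTagNameNS(NS, "row");
        List<Map<String, String>> result = new ArrayList<>();
        for (int r = 0; r < rows.getLength(); r++) {
            NodeList values = ((Element) rows.item(r)).getElementsByTagNameNS(NS, "value");
            Map<String, String> row = new LinkedHashMap<>();
            // Stop at the shorter list so a malformed row cannot cause an out-of-range access.
            for (int i = 0; i < items.getLength() && i < values.getLength(); i++) {
                String name = ((Element) items.item(i)).getAttribute("name");
                row.put(name, values.item(i).getTextContent());
            }
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<dataset xmlns=\"" + NS + "\">"
                + "<metadata><item name=\"EmployeeID\"/><item name=\"firstName\"/>"
                + "<item name=\"lastName\"/></metadata>"
                + "<data><row><value>EMP1</value><value>Joe</value><value>Blogs</value></row></data>"
                + "</dataset>";
        System.out.println(readRows(xml));
    }
}
```

Each map could then be bound to a domain object by hand. The XSLT route keeps that binding in the marshalling library instead, which may be preferable when a Castor mapping already exists.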

Related

XSLT 3.0 Identity transform document collection?

I have one XSLT 3.0:
<xsl:stylesheet version="3.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
xmlns:n1="urn:hl7-org:v3">
<xsl:output indent="yes" method="xml" encoding="utf-8"/>
<xsl:param name="icd10Map" as="map(xs:string, xs:string)"
select="
map {
'1742': 'C502',
'55090': 'K409',
'8442': 'S8350',
'7172': 'M2332',
'36616': 'H251',
'4550': 'K648'
}"/>
<xsl:variable name="map-keys" select="map:keys($icd10Map)"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="n1:translation[@codeSystemName = 'ICD-9-CM']/@code">
<xsl:attribute name="code">
<xsl:value-of select="$icd10Map($map-keys[translate(normalize-space(current()), ' .;', '') = .])"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
One input XML:
<?xml-stylesheet type="text/xsl" href="./Content/xsl/CDA.xsl"?>
<ClinicalDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:hl7-org:v3 NIST_C32_schema/C32_CDA.xsd" xmlns="urn:hl7-org:v3" xmlns:sdtc="urn:hl7-org:sdtc">
<realmCode code="US" />
<typeId root="2.16.840.1.113883.1.3" extension="POCD_HD000040" />
<templateId root="2.16.840.1.113883.10.20.22.1.1" />
************************************************************
<id extension="TT988" root="2.16.840.1.113883.19.5.99999.1" />
<code codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" code="11504-8" displayName="Surgical Operation Note" />
<title>Operative Report</title>
****************************************************
<component>
<structuredBody>
<component>
<section>
***********************
<entry>
<act moodCode="EVN" classCode="ACT">
<templateId root="2.16.840.1.113883.10.20.22.4.65" />
<code code="10219-4" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" displayName="Preoperative Diagnosis" />
<entryRelationship typeCode="SUBJ">
<observation classCode="OBS" moodCode="EVN">
<code code="282291009" displayName="Diagnosis" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" />
<statusCode code="completed" />
<!-- ICD-9 be transformed to ICD-10 -->
<value nullFlavor="OTH" type="CD">
<translation code="366.16" displayName="Nuclear sclerosis" codeSystem="2.16.840.1.113883.6.103" codeSystemName="ICD-9-CM" />
</value>
</observation>
</entryRelationship>
</act>
</entry>
</section>
</component>
</structuredBody>
</component>
</ClinicalDocument>
The transformation scenario in Oxygen with just one document is without issue:
*******************
<value nullFlavor="OTH" type="CD">
<translation code="H251"
displayName="Nuclear sclerosis"
codeSystem="2.16.840.1.113883.6.103"
codeSystemName="ICD-9-CM"/>
</value>
******************
However, with collection(), the XSLT 3.0 transform only seems to work like this:
<xsl:variable name="inFile" as="node()*" select="collection('hl7.xml')"/>
<xsl:template match="/">
<xsl:text>
ICD9 Target Transformation in the collection is:
</xsl:text>
<xsl:for-each select="$inFile//n1:translation[@codeSystemName = 'ICD-9-CM']/@code">
<xsl:value-of select="$icd10Map($map-keys[translate(normalize-space(current()), ' .;', '') = .])" separator=" , "/>
</xsl:for-each>
</xsl:template>
Result:
ICD9 Target Transformation in the collection is:
H251
H251
K648
K648
K409
K409
S8350
M2332
M2332
S8350
If I change the XSLT to:
<xsl:mode on-no-match="shallow-copy"/>
<xsl:variable name="inFile" as="node()*" select="collection('hl7.xml')"/>
<xsl:template match="n1:translation[@codeSystemName = 'ICD-9-CM']/@code">
<xsl:text>
ICD9 Target Transformation in the collection is:
</xsl:text>
<xsl:for-each select="$inFile">
<xsl:attribute name="code">
<xsl:value-of select="$icd10Map($map-keys[translate(normalize-space(current()), ' .;', '') = .])" separator=" , "/>
</xsl:attribute>
</xsl:for-each>
</xsl:template>
No transformation appears to happen; the result is merely an extraction of the URI list in the catalog file hl7.xml.
I have developed a Java application that can bulk-validate documents against an XSD, transform them (without collection()), and finally write them into a database. Its log shows the desired result:
Engine Instantiation: com.fc.andante.sax.SAXValidateStreamTransformWrite
Schema Validation Status: files in:/ml/Andante/data/data are validated against schema file:/ml/Andante/data/operation-transform.xsd
User 'auditor' has validated files in:/ml/Andante/data/data on 2020-08-26T23:05:26.357431
*****************
Transaction Status: Authenticating database writer...
Transaction Status: User audited as 'super' is transforming document set...
Transaction Status: Document data/data/cataract.xml is successfully transformed and written into database with uri '/xslt-transform/cataract.xml'
Transaction Status: Document data/data/breast-surgery.xml is successfully transformed and written into database with uri '/xslt-transform/breast-surgery.xml'
Transaction Status: Document data/data/hernia.xml is successfully transformed and written into database with uri '/xslt-transform/hernia.xml'
Transaction Status: Document data/data/colonoscopy.xml is successfully transformed and written into database with uri '/xslt-transform/colonoscopy.xml'
Transaction Status: Document data/data/knee.xml is successfully transformed and written into database with uri '/xslt-transform/knee.xml'
The transaction was completed successfully 2020-08-26T23:05:28.341385700
Can anyone help to solve the XSLT 3.0 Collections transform issue?
I would use a global parameter
<xsl:param name="inFiles" as="document-node()*" select="collection('hl7.xml')"/>
and then start processing with a named template
<xsl:template name="xsl:initial-template">
<xsl:for-each select="$inFiles">
<xsl:result-document href="/xslt-transform/{tokenize(document-uri(), '/')[last()]}">
<xsl:apply-templates/>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
Then you can xsl:import your first XSLT sample from the question or of course edit it to insert the code I have shown. Make sure you let Saxon start with the named template (-it command line option for Saxon; in oXygen by not providing a source document).

Unable to fetch the specific entire XML tag by using XSLT

I have the below sample XML file
Sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<testng-results skipped="0" failed="0" total="10" passed="10">
<class name="com.transfermoney.Transfer">
<test-method status="PASS" name="setParameter" is-config="true" duration-ms="4"
started-at="2018-08-16T21:43:38Z" finished-at="2018-08-16T21:43:38Z">
<params>
<param index="0">
<value>
<![CDATA[org.testng.TestRunner@31c2affc]]>
</value>
</param>
</params>
<reporter-output>
</reporter-output>
</test-method> <!-- setParameter -->
</class>
<class name="com.transfermoney.Transfer">
<test-method status="FAIL" name="setSettlementFlag" is-config="true" duration-ms="5"
started-at="2018-08-16T21:44:55Z" finished-at="2018-08-16T21:44:55Z">
<reporter-output>
<line>
<![CDATA[runSettlement Value Set :false]]>
</line>
</reporter-output>
</test-method> <!-- setSettlementFlag -->
</class>
</testng-results>
I just want to extract the block below from the XML above, based on the status PASS. The XML declaration and the <testng-results> and <class> tags should be ignored.
Expected Output:
<test-method status="PASS" name="setParameter" is-config="true" duration-ms="4"
started-at="2018-08-16T21:43:38Z" finished-at="2018-08-16T21:43:38Z">
<params>
<param index="0">
<value>
<![CDATA[org.testng.TestRunner@31c2affc]]>
</value>
</param>
</params>
<reporter-output>
</reporter-output>
</test-method>
I used the XSLT below to try to get this output from the sample XML, but it doesn't work: it returns all the tags, whereas I want only the output shown above and nothing else.
XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" method="xml"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="class">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
<xsl:for-each select="test-method[@status='PASS']">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
I am also using the Java code below to run the XSLT against the sample XML file.
Code:
String XML = fetchDataFrmXML(".//Test//testng-results_2.xml");
Transformer t = TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(XSL)));
t.transform(new StreamSource(new StringReader(XML)), new StreamResult(new File(".//Test//Sample1.xml")));
This is a sample payload. The actual payload has multiple nodes with "PASS" and "FAIL" status, and I'm only interested in fetching the PASS nodes in the output format above.
Any leads?
The result you show could be obtained quite simply by doing just:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/testng-results">
<xsl:copy-of select="class/test-method[@status='PASS']" />
</xsl:template>
</xsl:stylesheet>
However, in case of more than one test-method having a status of "PASS" this will result in an XML fragment with no single root element. So you'd probably be better off doing:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/testng-results">
<root>
<xsl:copy-of select="class/test-method[@status='PASS']" />
</root>
</xsl:template>
</xsl:stylesheet>
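The Java snippet in the question refers to an `XSL` string that is not shown. As a rough sketch of how the wrapped-root stylesheet above could be run with the JDK's built-in XSLT 1.0 processor, the class below holds the stylesheet in a constant and returns the result as a string; the class name `PassFilter` and the method `keepPassed` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PassFilter {
    // Same logic as the second stylesheet above: copy only PASS test-methods under one root.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:template match=\"/testng-results\">"
        + "<root><xsl:copy-of select=\"class/test-method[@status='PASS']\"/></root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given testng-results XML and returns the output text. */
    static String keepPassed(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<testng-results>"
                + "<class name=\"com.transfermoney.Transfer\">"
                + "<test-method status=\"PASS\" name=\"setParameter\"/>"
                + "<test-method status=\"FAIL\" name=\"setSettlementFlag\"/>"
                + "</class></testng-results>";
        System.out.println(keepPassed(xml));
    }
}
```

Writing to a `StringWriter` rather than a `File` is only a convenience for inspecting the result; a `StreamResult` over a file, as in the question, works the same way.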

How to copy <xml> tag inside CDATA by XSLT

I want to add nodes to my XML data while keeping the CDATA format unchanged. I have an XML file whose content should be copied into my template XML, shablon.xml.
Here is the XML content that should be added to shablon.xml:
<?xml version="1.0" encoding="UTF-8"?><Data>
<RawData Format="Text">
<organizationNameEng>fgfgfg</organizationNameEng>
<organizationNameGeo>dfdfdf</organizationNameGeo>
<organizationIdentifier>123456789</organizationIdentifier>
<cardNumber>dfdfdf</cardNumber></RawData>
<response>
<id>0123</id>
<content>her is content</content>
</response>
</Data>
Here is my template XML (shablon.xml):
<?xml version="1.0" encoding="UTF-8"?><DataPrep xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Lcid="String">
<Job Name="CompleteJob" Type="Normal" Priority="1">
<JobDetails>![CDATA[ here is job details ]]</JobDetails>
<DetailData>
<Information>
<RawData Format="Text"><</RawData>
</Data>
<response>
...
</response>
</Information>
</DetailData>
</Job>
Here is my XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0"
encoding="UTF-8" indent="yes" cdata-section-elements="RawData"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="DetailData/Information">
<xsl:copy>
<xsl:copy-of select="document('content.xml')"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This gives me a result like this:
<?xml version="1.0" encoding="UTF-8"?><DataPrep xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Lcid="String">
<Job Name="CompleteJob" Type="Normal" Priority="1">
<JobDetails>&lt here is job details &gt</JobDetails>
<DetailData>
<Information>
<RawData Format="Text"><![CDATA[
<organizationName>fgfgfg</organizationNameEn>
<organizationIdentifier>123456789</organizationIdentifier>
<cardNumber>dfdfdf</cardNumber>
]]></RawData>
</Data>
</Information>
</DetailData>
</Job>
</DataPrep>
How should I change my XSLT so that I can paste the tags inside the CDATA section (<![CDATA[ ... ]]>)? And how can I keep the response tag in the output and stop the CDATA markers from being rewritten as &lt and &gt, as happens here:
<JobDetails>&lt here is job details &gt</JobDetails>

XSLT skip duplicate element

I am a beginner to XSLT.
My Source XML is as below:
<Passengers>
<Passenger type="A" id="P1"/>
<Passenger type="A" id="P2"/>
<Passenger type="B" id="P3"/>
<Passenger type="C" id="P4"/>
</Passengers>
The output should be as below:
<Pax_Items>
<Item>
<Type>A</Type>
<Count>2</Count>
</Item>
<Item>
<Type>B</Type>
<Count>1</Count>
</Item>
<Item>
<Type>C</Type>
<Count>1</Count>
</Item>
</Pax_Items>
I have created XSLT as below
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0" exclude-result-prefixes="xmlns">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />
<xsl:variable name="filter" select="'TK,AJ'"/>
<xsl:template match="Passengers">
<xsl:element name="Pax_Items">
<xsl:apply-templates select="Passenger"/>
</xsl:element>
</xsl:template>
<xsl:template match="Passenger">
<xsl:element name="Item">
<xsl:element name="Type">
<xsl:value-of select="@type"/>
</xsl:element>
<xsl:element name="Count">
<xsl:value-of select="count(//Passenger[@type=current()/@type])"/>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
With the above XSLT I got the following output:
<Pax_Items>
<Item>
<Type>A</Type>
<Count>2</Count>
</Item>
<Item>
<Type>A</Type>
<Count>2</Count>
</Item>
<Item>
<Type>B</Type>
<Count>1</Count>
</Item>
<Item>
<Type>C</Type>
<Count>1</Count>
</Item>
</Pax_Items>
How can I omit or skip the duplicate element? Please help.
This is actually a good example of a grouping problem. In XSLT 1.0, the most efficient way to do grouping is a technique called "Muenchian Grouping", so it is worth learning about.
In this case, you want to group Passenger elements by their @type attribute, so you would define a key to do this:
<xsl:key name="Passengers" match="Passenger" use="@type"/>
Then you need to select the Passenger elements that are the first occurrence in the group for their @type attribute. This is done as follows:
<xsl:apply-templates
select="Passenger[generate-id() = generate-id(key('Passengers', @type)[1])]"/>
Note the use of generate-id, which generates a unique ID for a node, allowing two nodes to be compared.
Counting the number of occurrences in the group is then straightforward:
<xsl:value-of select="count(key('Passengers', @type))"/>
Here is the full XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="Passengers" match="Passenger" use="@type"/>
<xsl:template match="Passengers">
<Pax_Items>
<xsl:apply-templates select="Passenger[generate-id() = generate-id(key('Passengers', @type)[1])]"/>
</Pax_Items>
</xsl:template>
<xsl:template match="Passenger">
<Item>
<Type>
<xsl:value-of select="@type"/>
</Type>
<Count>
<xsl:value-of select="count(key('Passengers', @type))"/>
</Count>
</Item>
</xsl:template>
</xsl:stylesheet>
When applied to your sample XML, the following is output
<Pax_Items>
<Item>
<Type>A</Type>
<Count>2</Count>
</Item>
<Item>
<Type>B</Type>
<Count>1</Count>
</Item>
<Item>
<Type>C</Type>
<Count>1</Count>
</Item>
</Pax_Items>
Also note there is no real reason to use xsl:element to output static elements. Just write out the element directly.
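To try the Muenchian grouping stylesheet outside an XML editor, it can be run with the XSLT 1.0 processor that ships with the JDK. The class below is a minimal sketch under that assumption; `PassengerCounts` and `group` are hypothetical names, and the stylesheet string is a compact copy of the full XSLT above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PassengerCounts {
    // Key on @type, then visit only the first Passenger of each key group.
    static final String XSL =
          "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:key name=\"Passengers\" match=\"Passenger\" use=\"@type\"/>"
        + "<xsl:template match=\"Passengers\">"
        + "<Pax_Items><xsl:apply-templates select=\"Passenger[generate-id() = "
        + "generate-id(key('Passengers', @type)[1])]\"/></Pax_Items>"
        + "</xsl:template>"
        + "<xsl:template match=\"Passenger\">"
        + "<Item><Type><xsl:value-of select=\"@type\"/></Type>"
        + "<Count><xsl:value-of select=\"count(key('Passengers', @type))\"/></Count></Item>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns one Item per distinct passenger type, with its count. */
    static String group(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Passengers>"
                + "<Passenger type=\"A\" id=\"P1\"/><Passenger type=\"A\" id=\"P2\"/>"
                + "<Passenger type=\"B\" id=\"P3\"/><Passenger type=\"C\" id=\"P4\"/>"
                + "</Passengers>";
        System.out.println(group(xml));
    }
}
```

Because the key is built once per document, the lookup avoids rescanning all Passenger elements for every node, which is where the approach gains over a `//Passenger[...]` count on large inputs.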
Update your Passenger template as follows; I have added an if condition to skip duplicate nodes:
<xsl:template match="Passenger">
<xsl:if test="not(preceding-sibling::Passenger[@type = current()/@type])">
<xsl:element name="Item">
<xsl:element name="Type">
<xsl:value-of select="@type"/>
</xsl:element>
<xsl:element name="Count">
<xsl:value-of select="count(//Passenger[@type=current()/@type])"/>
</xsl:element>
</xsl:element>
</xsl:if>
</xsl:template>

Bypassing namespaces while copying an XML with XSLT

Starting from an XML document with no namespace declared:
<Root>
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
I apply an XSLT to remove the 'C' element:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="no" encoding="utf-8" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="C" />
</xsl:stylesheet>
and I end up with the following XML (it's OK to have 'B' not collapsed because I'm using HTML as output method):
<Root>
<A>foo</A>
<B></B>
</Root>
But then if I ever get another XML, this time with a namespace:
<Root xmlns="http://company.com">
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
the 'C' element is not removed by the XSLT process.
Is there a way to bypass this namespace?
Not really recommended, but it works:
<xsl:template match="*[local-name()='C']" />
Better:
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foo="http://company.com"
exclude-result-prefixes="foo"
>
<!-- ... -->
<xsl:template match="C | foo:C" />
<!-- ... -->
</xsl:stylesheet>
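A quick way to check the second variant against both inputs is to run it from Java. The sketch below assumes the JDK's built-in XSLT 1.0 processor, so the stylesheet is declared as version 1.0 and uses an identity template with XML output instead of the HTML output in the question; `RemoveC` and `strip` are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RemoveC {
    // Identity copy plus an empty template matching C with and without the namespace.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:foo=\"http://company.com\" exclude-result-prefixes=\"foo\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"C | foo:C\"/>"
        + "</xsl:stylesheet>";

    /** Copies the input document, dropping C elements in either namespace. */
    static String strip(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(strip("<Root><A>foo</A><B></B><C>bar</C></Root>"));
        System.out.println(strip(
                "<Root xmlns=\"http://company.com\"><A>foo</A><B></B><C>bar</C></Root>"));
    }
}
```

The explicit `foo:C` pattern only removes C elements from the one known namespace, whereas the `local-name()` test would remove a C element from any namespace, which is why the explicit form is usually the safer choice.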
