Differentiate XMLs based on namespace in Apache Camel - java

I am using Spring Boot and Apache Camel in my project. In the architecture, an XML message arrives from an input queue at the Camel layer, where it is transformed into another XML using XSLT, and then the final XML is sent to an output queue. The incoming XML has the following form:
<tns:Standalone xmlns:tns="namespace1">
<tns:name>Test</tns:name>
</tns:Standalone>
and this is transformed correctly by the XSLT. The problem is that in my flow the tns namespace of the incoming XML can vary (say, a different XML can arrive with tns bound to namespace2), and then the XSLT fails. So I need logic to differentiate the incoming XMLs based on the tns value, so that I can use a different XSLT for each scenario. Can you please guide me on how to differentiate input XMLs based on tns?

Here's a simple example showing how a single XSLT can handle the same nodes in two different namespaces:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ns1="namespace1"
xmlns:ns2="namespace2"
exclude-result-prefixes="ns1 ns2">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/ns1:Standalone | /ns2:Standalone">
<output>
<xsl:value-of select="ns1:name | ns2:name"/>
</output>
</xsl:template>
</xsl:stylesheet>
When this stylesheet is applied to either one of the following inputs:
XML 1
<tns:Standalone xmlns:tns="namespace1">
<tns:name>Test</tns:name>
</tns:Standalone>
XML 2
<tns:Standalone xmlns:tns="namespace2">
<tns:name>Test</tns:name>
</tns:Standalone>
the result will be:
Result
<?xml version="1.0" encoding="UTF-8"?>
<output>Test</output>
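If separate stylesheets are still preferred, the namespace of the root element can be read before the transformation and used to pick one. The following is only a sketch using the JDK's StAX API, not anything Camel-specific; the class name RootNamespace and both stylesheet file names are made up for illustration.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads only as far as the root element and reports its namespace URI.
class RootNamespace {

    // Assumed mapping from namespace URI to stylesheet; both file names are invented.
    static final Map<String, String> STYLESHEETS = Map.of(
            "namespace1", "transform-ns1.xsl",
            "namespace2", "transform-ns2.xsl");

    static String of(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            // nextTag() skips whitespace, comments and processing instructions
            // and stops at the start tag of the root element.
            reader.nextTag();
            String uri = reader.getNamespaceURI();
            return uri == null ? "" : uri; // no namespace on the root element
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<tns:Standalone xmlns:tns=\"namespace2\">"
                + "<tns:name>Test</tns:name></tns:Standalone>";
        String namespace = of(xml);
        System.out.println(namespace + " -> " + STYLESHEETS.get(namespace));
    }
}
```

Only the start of the message is parsed, so the check is cheap. The lookup result could then drive whatever routing decision the flow uses to choose the XSLT.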


Get the count of attributes using xpath in java without dom parser

I want to fetch the count of attributes using XPath in Java. I know we can use DOM parsers, but my input file is going to be very large. I can't really use SAX, as there are multiple nested tags I need to take care of. I'm also not sure which attributes are going to be inside the XML document. Having XPath would make my life easier, but I'm really worried a DOM parser will exhaust the memory. I read about the s9api but couldn't really solve it with that. Are there any alternative libraries in Java that support XPath without a DOM parser? Sharing examples would be really helpful.
Let's say my input is
<?xml version="1.0" encoding="UTF-8"?>
<cricketers>
<continent>
<team>
<aussies>
<cricketer type="righty">
<name>Smith</name>
<role>Captain</role>
<position>Wicket-Keeper</position>
</cricketer>
<cricketer type="lefty">
<name>Warner</name>
<role>Batsman</role>
<position>Point</position>
</cricketer>
</aussies>
</team>
</continent>
<continent>
<team>
<england>
<cricketer type="righty">
<name>Morgan</name>
<role>Captain</role>
<position>Covers</position>
</cricketer>
<cricketer type="lefty">
<name>Cook</name>
<role>Batsman</role>
<position>Point</position>
</cricketer>
</england>
</team>
</continent>
<continent>
<team>
<aussies>
<cricketer type="righty">
<name>Smith</name>
<role>Captain</role>
<position>Wicket-Keeper</position>
</cricketer>
<cricketer type="lefty">
<name>Warner</name>
<role>Batsman</role>
<position>Point</position>
</cricketer>
</aussies>
</team>
</continent>
</cricketers>
Given an xpath //team/aussies/cricketer, the count is 4 in this case.
I want to implement something like this without DOM parser
With XSLT 3 supporting streaming (e.g. with Saxon EE 10 or 9.9) you can use
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
expand-text="yes">
<xsl:output method="adaptive"/>
<xsl:mode streamable="yes"/>
<xsl:template match="/">
<xsl:sequence select="count(//@*)"/>
</xsl:template>
</xsl:stylesheet>
if the task is really only to count all attributes. Saxon should run that in a single, forward-only pass through the whole document without building a full tree of all nodes.
Counting elements selected by a path that uses only child steps and no predicates, like
<xsl:template match="/">
<xsl:sequence select="count(//team/aussies/cricketer)"/>
</xsl:template>
should also work.
In the s9api, you simply need to make sure you pass in the input document as a stream to the Xslt30Transformer e.g.
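// Requires Saxon on the classpath; streaming needs Saxon-EE, hence the licensed Processor(true).
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.*;

Processor processor = new Processor(true);
XsltCompiler compiler = processor.newXsltCompiler();
XsltExecutable executable = compiler.compile(new StreamSource("count-example1.xsl"));
Xslt30Transformer transformer = executable.load30();
// Passing a StreamSource (rather than a pre-built tree) lets Saxon stream the input.
XdmValue result = transformer.applyTemplates(new StreamSource("sample1.xml"));
System.out.println(result);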
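If Saxon EE is not available, the same two counts can be approximated without XPath by using the JDK's StAX pull parser, which also reads the document forward-only without building a tree. This is a different technique from the streaming XSLT above, shown only as a sketch; the class and method names are made up, and the path matching handles plain child steps only (no predicates, no namespaces).

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper that counts nodes in one forward pass over the input.
class StreamingCounts {

    // Counts every attribute in the document, comparable to count(//@*).
    static int countAttributes(Reader xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count += reader.getAttributeCount(); // namespace declarations are not included
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    // Counts elements whose ancestor path ends with the given local names,
    // comparable to count(//team/aussies/cricketer).
    static int countByPathSuffix(Reader xml, String... suffix) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        List<String> path = new ArrayList<>(); // local names from the root to the current element
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    path.add(reader.getLocalName());
                    if (endsWith(path, suffix)) {
                        count++;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    path.remove(path.size() - 1);
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    private static boolean endsWith(List<String> path, String[] suffix) {
        int offset = path.size() - suffix.length;
        if (offset < 0) {
            return false;
        }
        for (int i = 0; i < suffix.length; i++) {
            if (!path.get(offset + i).equals(suffix[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<cricketers><continent><team><aussies>"
                + "<cricketer type=\"righty\"/><cricketer type=\"lefty\"/>"
                + "</aussies></team></continent></cricketers>";
        System.out.println(countAttributes(new StringReader(xml)));
        System.out.println(countByPathSuffix(new StringReader(xml), "team", "aussies", "cricketer"));
    }
}
```

For a large file, a FileReader or an InputStream-based reader would be passed instead of the StringReader used in the demo; memory use then depends on nesting depth rather than on document size.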

ArrayIndexOutOfBoundsException when transforming XML using XSLT

I am using the Java javax.xml.transform library in my Scala Play application to perform a simple XSLT transformation on some XML. I am trying to remove the namespace from one of the elements, but I am getting an exception when I POST XML to the endpoint which does the transformation.
The method I have written to do the transformation is below:
def transformXml(xml: String, xslName: String): Try[String] = {
Try {
// Create transformer factory
val factory: TransformerFactory = TransformerFactory.newInstance()
// Use the factory to create a template containing the xsl file
val template: Templates = factory.newTemplates(new StreamSource(new FileInputStream(s"app/xsl/$xslName.xsl")))
// Use the template to create a transformer
val xformer: Transformer = template.newTransformer()
// Prepare the input for transformation
val input: Source = new StreamSource(new StringReader(xml))
// Prepare the output for transformation result
val outputBuffer: Writer = new StringWriter
val output: javax.xml.transform.Result = new StreamResult(outputBuffer)
// Apply the xslt transformation to the input and store the result in the output
xformer.transform(input, output)
// Return the transformed XML
outputBuffer.toString
}
}
Through putting printlns in my code, I have deduced that it is in fact failing at the xformer.transform(input, output) line. The XML I am passing in and the XSL file I am using to transform are below:
<?xml version="1.0"?>
<Message xmlns="http://foo.bar" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
<EnvelopeVersion>2.0</EnvelopeVersion>
<Header>
<MessageDetails>
...
...
...
</MessageDetails>
<SenderDetails/>
</Header>
<OtherDetails>
<Keys/>
</OtherDetails>
<Body>
</Body>
</Message>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" exclude-result-prefixes="">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:param name="ancestralNamespace" select="namespace-uri(/*[1])"/>
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:with-param name="ancestralNamespace" select="$ancestralNamespace"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="*[contains(namespace-uri(),'foo.bar')]">
<xsl:param name="ancestralNamespace" select="namespace-uri(..)"/>
<xsl:element name="{local-name()}" namespace="">
<xsl:apply-templates select="@*|node()">
<xsl:with-param name="ancestralNamespace" select="$ancestralNamespace"/>
</xsl:apply-templates>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
My expected output is this:
<?xml version="1.0"?>
<Message>
<EnvelopeVersion>2.0</EnvelopeVersion>
<Header>
<MessageDetails>
...
...
...
</MessageDetails>
<SenderDetails/>
</Header>
<OtherDetails>
<Keys/>
</OtherDetails>
<Body>
</Body>
</Message>
The error I get back from sending a POST request to my endpoint is this:
{
"statusCode": 500,
"message": "javax.xml.transform.TransformerException: java.lang.ArrayIndexOutOfBoundsException: -1"
}
I do not have much experience with XSLT and have inherited this code from someone else to try to debug, so if anyone with XML/XSLT experience could give me some help I would greatly appreciate it. The perplexing thing is that the person I got this problem from had written Unit Tests using this method (send in my example XML and get out the expected XML) and they passed so I don't know where to look next.
Right so after a few hours of debugging and fretting over this, I found the solution!
The default transformer which my Play application was using handles XSLT differently, and was getting confused at the line <xsl:param name="ancestralNamespace" select="namespace-uri(/*[1])"/>. What solved my issue was to use a different transformer. The one I found to work was Xalan (version 2.7.2), and after importing that into my project build file I hit the endpoint and the transformation was successful.
To import the version I found to work, add the following to your build:
"xalan" % "xalan" % "2.7.2" % "runtime"
I believe that the "runtime" section is the most important part, as it seems to overwrite what the application would normally use. I would guess that the reason my tests passed but my endpoint failed is that Scala Test runs with different configuration to runtime. Nothing else about my code had to be changed.
I hope this helps to stop anyone else from encountering this (admittedly rather unique) error! I ended up trawling through countless forums from as far back as 2002 before resorting to trying a different runtime configuration.
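Since the fix above came down to which XSLT implementation gets picked up, it may help to check which TransformerFactory the JAXP lookup actually resolves to in a given environment (tests versus the running application). A minimal sketch in Java, callable from Scala as well; the class name WhichTransformer is made up.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic: reports the TransformerFactory implementation found by JAXP.
class WhichTransformer {

    static String factoryClassName() {
        // newInstance() applies the standard JAXP lookup (system property,
        // service providers on the classpath, then the platform default).
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(factoryClassName());
    }
}
```

Running this once from a test and once from the application should show whether the two really use different implementations. The exact class names printed depend on the JDK and on which XSLT libraries are on the classpath.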

How to handle duplicate node names when converting xml to csv using java and xsl

I am given an XML file from an outside source (so I have no control over the element names) and unfortunately they use the same name for a paired set of data. I can't seem to figure out how to access the second value. An example of the data in the XML file is:
<?xml version="1.0"?>
<addressResponse>
<results>
<ownerName>Name1</ownerName>
<houseAddress>House1</houseAddress>
<houseAddress>CityState1</houseAddress>
<yearBuilt>Year1</yearBuilt>
</results>
<results>
<ownerName>Name2</ownerName>
<houseAddress>House2</houseAddress>
<houseAddress>CityState2</houseAddress>
<yearBuilt>Year2</yearBuilt>
</results>
</addressResponse>
I already have my Java code together and can parse the XML, but I need help handling the duplicate element name. I want my CSV file to look like the following:
owner,address,citystate,yearbuilt
Name1,House1,CityState1,Year1
Name2,House2,CityState2,Year2
In my xsl file, I did the following "hoping" it would get the second houseAddress but it didn't:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">owner,address,citystate,yearbuilt
<xsl:for-each select="//results">
<xsl:value-of select="concat(ownerName,',',houseAddress,',',houseAddress,',',yearBuilt,'
')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
That gave me:
owner,address,citystate,yearbuilt
Name1,House1,House1,Year1
Name2,House2,House2,Year2
Is there a trick to do this? I can't get the element names changed by the originator, so I'm stuck with them. Thank you in advance.
Use:
houseAddress[2]
to get the value of the second occurrence of the houseAddress element.
Note that we are assuming XSLT 1.0 here.
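To see the positional predicate in context, here is a small self-contained sketch that runs a reduced version of the stylesheet through the JDK's built-in XSLT 1.0 processor. The class name AddressCsv is made up, and the stylesheet is inlined as a string only to keep the example in one file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: converts the addressResponse XML to CSV.
class AddressCsv {

    // houseAddress[1] and houseAddress[2] select the first and second
    // houseAddress child of each results element.
    static final String XSL = String.join("\n",
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
            "<xsl:output method=\"text\"/>",
            "<xsl:template match=\"/\">",
            "<xsl:text>owner,address,citystate,yearbuilt&#10;</xsl:text>",
            "<xsl:for-each select=\"//results\">",
            "<xsl:value-of select=\"concat(ownerName,',',houseAddress[1],',',houseAddress[2],',',yearBuilt,'&#10;')\"/>",
            "</xsl:for-each>",
            "</xsl:template>",
            "</xsl:stylesheet>");

    static String toCsv(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<addressResponse><results>"
                + "<ownerName>Name1</ownerName>"
                + "<houseAddress>House1</houseAddress>"
                + "<houseAddress>CityState1</houseAddress>"
                + "<yearBuilt>Year1</yearBuilt>"
                + "</results></addressResponse>";
        System.out.print(toCsv(xml));
    }
}
```

Without the predicate, concat() converts the houseAddress node-set to the string value of its first node, which is why the original stylesheet printed House1 twice.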

SXXP0003: Error reported by XML parser: Content is not allowed in prolog

My XML file is
<?xml version="1.0" encoding="ISO-8859-1"?>
<T0020
xsi:schemaLocation="http://www.safersys.org/namespaces/T0020V1 T0020V1.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.safersys.org/namespaces/T0020V1">
<INTERFACE>
<NAME>SAFER</NAME>
<VERSION>04.02</VERSION>
</INTERFACE>
<TRANSACTION>
<VERSION>01.00</VERSION>
<OPERATION>REPLACE</OPERATION>
<DATE_TIME>2009-09-01T00:00:00</DATE_TIME>
<TZ>CT</TZ>
</TRANSACTION>
<IRP_ACCOUNT>
<IRP_CARRIER_ID_NUMBER>564182</IRP_CARRIER_ID_NUMBER>
<IRP_BASE_COUNTRY>US</IRP_BASE_COUNTRY>
<IRP_BASE_STATE>AR</IRP_BASE_STATE>
<IRP_ACCOUNT_NUMBER>67432</IRP_ACCOUNT_NUMBER>
<IRP_ACCOUNT_TYPE>I</IRP_ACCOUNT_TYPE>
<IRP_STATUS_CODE>100</IRP_STATUS_CODE>
<IRP_STATUS_DATE>2008-02-01</IRP_STATUS_DATE>
<IRP_UPDATE_DATE>2009-06-18</IRP_UPDATE_DATE>
<IRP_NAME>
<NAME_TYPE>LG</NAME_TYPE>
<NAME>LARRY SHADDON</NAME>
<IRP_ADDRESS>
<ADDRESS_TYPE>PH</ADDRESS_TYPE>
<STREET_LINE_1>10291 HWY 124</STREET_LINE_1>
<STREET_LINE_2/>
<CITY>RUSSELLVILLE</CITY>
<STATE>AR</STATE>
<ZIP_CODE>72802</ZIP_CODE>
<COUNTY>POPE</COUNTY>
<COLONIA/>
<COUNTRY>US</COUNTRY>
</IRP_ADDRESS>
<IRP_ADDRESS>
<ADDRESS_TYPE>MA</ADDRESS_TYPE>
<STREET_LINE_1>10291 HWY124</STREET_LINE_1>
<STREET_LINE_2/>
<CITY>RUSSELLVILLE</CITY>
<STATE>AR</STATE>
<ZIP_CODE>72802</ZIP_CODE>
<COUNTY>POPE</COUNTY>
<COLONIA/>
<COUNTRY>US</COUNTRY>
</IRP_ADDRESS>
</IRP_NAME>
</IRP_ACCOUNT>
</T0020>
I am using the following XSLT to split my XML file into multiple XML files.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:t="http://www.safersys.org/namespaces/T0020V1" version="2.0">
<xsl:output method="xml" indent="yes" name="xml" />
<xsl:variable name="accounts" select="t:T0020/t:IRP_ACCOUNT" />
<xsl:variable name="size" select="30" />
<xsl:template match="/">
<xsl:for-each select="$accounts[position() mod $size = 1]">
<xsl:variable name="filename" select="resolve-uri(concat('output/',position(),'.xml'))" />
<xsl:result-document href="{$filename}" method="xml">
<T0020>
<xsl:for-each select=". | following-sibling::t:IRP_ACCOUNT[position() &lt; $size]">
<xsl:copy-of select="." />
</xsl:for-each>
</T0020>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
It works well in a sample Java application, but when I tried to use the same in my Spring-based application it gives the following error.
Error on line 1 column 1 of T0020:
SXXP0003: Error reported by XML parser: Content is not allowed in prolog.
I don't know what is going wrong. Please help me. Thanks in advance.
Your XML starts with a byte-order mark in UTF-8 (0xEF,0xBB,0xBF), which isn't visible. Try opening your file with a hex editor and have a look.
Many text editors under Windows like to insert this at the start of UTF-8 encoded text, despite the fact that UTF-8 doesn't actually need a byte order mark since the ordering of bytes in UTF-8 is already well defined.
Java's XML parsers can choke on a BOM with exactly the error message you are seeing. You'll need to either strip out the BOM yourself, or wrap the InputStream that you're handing to the XML parser so that the BOM is stripped at parsing time.
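A minimal sketch of such a wrapper, assuming the input arrives as a byte stream and only a UTF-8 BOM needs handling; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: hides a leading UTF-8 byte-order mark from whatever reads the stream next.
class BomStripper {

    // Returns a stream positioned just after a leading EF BB BF sequence, if one is present.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = pushback.readNBytes(head, 0, 3); // may be fewer than 3 for very short input
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read); // not a BOM: put the bytes back untouched
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        InputStream clean = skipUtf8Bom(new ByteArrayInputStream(data));
        System.out.println(new String(clean.readAllBytes(), StandardCharsets.UTF_8));
    }
}
```

The returned stream can be handed to the parser or wrapped in a StreamSource. If the XML is passed around as a String or Reader instead, the BOM shows up as the character U+FEFF and would have to be removed at the character level.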
There is some content in the document before the XML data starts, probably whitespace at a guess (that's where I've seen this before).
The prolog is the part of the document that is before the opening tag, with tag-like constructs like <? and <!. You may have some characters/whitespace in between these tags too. Prologs and valid content are explained on tiztag.com.
Maybe post a depersonalised example of your XML data?
It's also possible to get this if you attempt to process the content twice (which is fairly easy to do in Spring). In that case, there'd be nothing wrong with your XML. This scenario seems likely, since the sample application works but introducing Spring causes problems.
In my case the encoding="UTF-16" was causing this issue. It got resolved when I changed it to UTF-8.

using Castor to parse xml based on attribute values

Using Castor to parse the following xml into POJOs using a mapping file is fairly straightforward:
<human name="bob"/>
<dog owner="alice"/>
It uses the name of the element to map to the class. But what if an attribute should be used to do the mapping? e.g.:
<animal type="human" name="bob"/>
<animal type="dog" owner="alice"/>
This contrived example is based on XML that I have to consume (though I didn't author it!). Any ideas on how to approach this with Castor mapping files?
There are two ways to approach this. Change your Java class structure to have human and dog extend animal, and then write a mapping file for Animal.
Or just use XSLT to transform your data. Something like this might work:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="animal">
<xsl:text disable-output-escaping="yes"><![CDATA[<]]></xsl:text>
<xsl:value-of select="@type" /><xsl:text disable-output-escaping="yes"> </xsl:text>name="<xsl:value-of select="@name" />"
<xsl:text disable-output-escaping="yes"><![CDATA[/>]]></xsl:text>
</xsl:template>
</xsl:stylesheet>
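The same pre-processing idea can also be written with xsl:element instead of disable-output-escaping, so that the remaining attributes (such as owner) are carried over as well. This is a sketch of that variant, run from Java with the JDK's built-in XSLT processor; the class name AnimalRename and the animals wrapper element in the demo input are assumptions, not part of the original XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical pre-processing step: renames each animal element after its type attribute.
class AnimalRename {

    static final String XSL = String.join("\n",
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
            "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>",
            "<!-- identity template: copy everything else unchanged -->",
            "<xsl:template match=\"@*|node()\">",
            "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>",
            "</xsl:template>",
            "<!-- use the value of @type as the element name and drop the attribute itself -->",
            "<xsl:template match=\"animal[@type]\">",
            "<xsl:element name=\"{@type}\">",
            "<xsl:apply-templates select=\"@*[name() != 'type']|node()\"/>",
            "</xsl:element>",
            "</xsl:template>",
            "</xsl:stylesheet>");

    static String rename(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<animals>"
                + "<animal type=\"human\" name=\"bob\"/>"
                + "<animal type=\"dog\" owner=\"alice\"/>"
                + "</animals>";
        System.out.println(rename(xml));
    }
}
```

The transformed document then uses element names again, so the straightforward Castor mapping by element name from the question should apply to it.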
