Using XSLT to output multiple files - java

I'm trying to get an example that I found for using XSLT 2.0 to output multiple files working.
Using Saxon B 9.7.0.1 with Java 1.6, I get this error:
C:\Documents and Settings\Administrator\Desktop\saxon>java -jar saxon9.jar -s:input.xml -xsl:transform.xml
Error on line 15 of transform.xml:
java.net.URISyntaxException: Illegal character in path at index 20: file:///C:/Documents
and Settings/Administrator/Desktop/saxon/output1/test1.html
at xsl:for-each (file:/C:/Documents%20and%20Settings/Administrator/Desktop/saxon/transform.xml#10)
processing /tests/testrun[1]
Transformation failed: Run-time errors were reported
input.xml
<?xml version="1.0" encoding="UTF-8"?>
<tests>
<testrun run="test1">
<test name="foo" pass="true" />
<test name="bar" pass="true" />
<test name="baz" pass="true" />
</testrun>
<testrun run="test2">
<test name="foo" pass="true" />
<test name="bar" pass="false" />
<test name="baz" pass="false" />
</testrun>
<testrun run="test3">
<test name="foo" pass="false" />
<test name="bar" pass="true" />
<test name="baz" pass="false" />
</testrun>
</tests>
transform.xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="text"/>
<xsl:output method="html" indent="yes" name="html"/>
<xsl:template match="/">
<xsl:for-each select="//testrun">
<xsl:variable name="filename"
select="concat('output1/',@run,'.html')" />
<xsl:value-of select="$filename" /> <!-- Creating -->
<xsl:result-document href="{$filename}" format="html">
<html><body>
<xsl:value-of select="@run"/>
</body></html>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Character 20 in your URI is the first space in "Documents and Settings". As a quick fix, try moving the files to a path without spaces. (Say, "C:\test" or some such.) I suspect the long-term fix is to change your XSLT to encode spaces to %20 before feeding $filename to xsl:result-document, but I'm afraid my XSLT-2.0-fu isn't strong enough to tell you how.
Edit: I haven't tested this, as I don't have an XSLT 2.0 processor handy, but after glancing at the docs, it looks like you want the encode-for-uri function. Something like the following may work for you:
<xsl:result-document href="{fn:encode-for-uri($filename)}" format="html">
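The exception itself can be reproduced outside Saxon with plain java.net.URI. The following is only a minimal sketch (the class and method names are made up for illustration): the single-argument constructor rejects the raw space, while the multi-argument constructor quotes it as %20.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSpaceDemo {

    /** Returns the index at which URI parsing fails, or -1 if the string parses. */
    public static int firstIllegalIndex(String uri) {
        try {
            new URI(uri);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    /** Builds a file: URI from a plain path; this constructor quotes illegal characters. */
    public static String toFileUri(String path) {
        try {
            return new URI("file", null, path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/C:/Documents and Settings/Administrator/Desktop/saxon/output1/test1.html";
        // The raw string fails at the first space, as in the Saxon error message
        System.out.println(firstIllegalIndex("file://" + path));
        // The quoted form has %20 in place of each space
        System.out.println(toFileUri(path));
    }
}
```

This only illustrates where "index 20" comes from; whether Saxon escapes the base output URI for you depends on the Saxon and Java versions in use.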

I had the same issue with Saxon's -o:outputfile option replacing spaces with %20.
It turned out to depend on the Saxon and Java versions:
Linux, Java 1.7.0_45: Saxon creates a %20 directory
Unix, Java 1.5.0_61: Saxon creates a %20 directory
Unix, Java 1.4.2_22: Saxon does not create a %20 directory

Related

If regEx is bad for matching XML, what is the correct way?

I was trying to do a simple string delete in XML.
I want to delete something like the following.
<A>
<B>Test Name</B>
</A>
Has to work with all possible XML, though.
<Test><A><B>Test Name</B></A></Test>
<Test ><A ><B >Test Name</B ></A ></Test >
<Test>
<A>
<B>Test Name</B>
</A>
</Test>
etc, etc.
The regular expression I have so far is simply:
<A>\s*(\r\n|\r|\n)*\s*<B>Test Name<\/B>\s*(\r\n|\r|\n)*\s*<\/A>
Everyone always says regex is bad for matching XML, which it clearly is. So what should I use instead?
The best approach for this case would be XSLT. Even with XSLT 1.0 this is simple. You can use the Java XSLT processor, Linux's xsltproc, or any other XSLT processor; every XSLT processor supports at least XSLT 1.0:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />
<!-- identity template - matches everything except the things matched by other templates -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Removes the elements you do not want -->
<xsl:template match="A[B[normalize-space(.)='Test Name']]" />
</xsl:stylesheet>
The output of your sample case (with a hypothetical root element) would be
<Test/>
<Test/>
<Test/>
Trying to use regex would be error-prone and not good practice at all.
Why make it complicated when it can be this easy?
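For the "Java XSLT processor" route, a minimal sketch using the JDK's built-in javax.xml.transform API might look like this. The class name RemoveElementDemo and the inlined stylesheet string are illustrative assumptions; the stylesheet is the same identity-plus-empty-template idea as above, without indentation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RemoveElementDemo {

    // Identity template plus one empty template that swallows the unwanted <A> element
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"A[B[normalize-space(.)='Test Name']]\"/>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML string and returns the serialized result. */
    public static String strip(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // All three spellings from the question are the same XML to a parser
        System.out.println(strip("<Test><A><B>Test Name</B></A></Test>"));
        System.out.println(strip("<Test ><A ><B >Test Name</B ></A ></Test >"));
        System.out.println(strip("<Test>\n<A>\n<B>Test Name</B>\n</A>\n</Test>"));
    }
}
```

Because the input is parsed before matching, whitespace inside tags and between elements no longer matters, which is exactly what the regex was struggling with.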

XSL - "Can not compile stylesheet" error, but the syntax is fine [duplicate]

I get this error when I do this in Java:
InputStream foo = getClass().getResourceAsStream(XML_TRANSFORM);
javax.xml.transform.Transformer transformer = factory.newTransformer(new StreamSource(foo)); // error
I cannot figure out where the error is, because I have no further information and because online XML validators tell me there is no syntax error (e.g. https://www.w3schools.com/xml/xml_validator.asp).
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<TimeSeriesDoc>
<xsl:apply-templates select="/eai-document" />
<xsl:apply-templates select="eai-timeserie" />
</TimeSeriesDoc>
<xsl:template match="/eai-document">
<DocumentHeader>
<Identification v="{@identification}" />
<Version v="{@version}" />
<Type v="{@type}" />
<Status v="{@status}" />
<ClassificationType v="{@classification}" />
<CreationDateTime v="{@creation-date-time}" />
<ProcessType v="{@process-type}" />
<Initiator v="{@initiator}" />
<InitiatorRole v="{@initiator-role}" />
<Receptor v="{@receptor}" />
<ReceptorRole v="{@receptor-role}" />
<ElaborateDateTime v="{@elaborate-date-time}" />
</DocumentHeader>
</xsl:template>
<xsl:template match="eai-timeserie">
<TimeSeries>
<Identification v="{@identification}" />
<BusinessType v="{@business-type}" />
<Product v="{@product}" />
<ObjectAggregation v="{@object-agregation}" />
<AreaIdentification v="{@area-id}" />
<ResourceObject v="{@resource-object}" />
<MeasureUnit v="{@measure-unit}" />
<CurveType v="{@curve-type}" />
<xsl:apply-templates select="eai-period" />
</TimeSeries>
</xsl:template>
<xsl:template match="eai-period">
<Period>
<Resolution v="{@resolution}" />
<xsl:apply-templates select="eai-point" />
</Period>
</xsl:template>
<xsl:template match="eai-point">
<Pt>
<P v="{@position}" />
<Q v="{@quantity}" />
</Pt>
</xsl:template>
</xsl:stylesheet>
There's a slightly odd rule in XSLT which means that if you had
<x:TimeSeriesDoc>
<xsl:apply-templates select="/eai-document" />
<xsl:apply-templates select="eai-timeserie" />
</x:TimeSeriesDoc>
at the top level of your stylesheet then it would be valid (but ignored and useless), while if you have
<TimeSeriesDoc>
<xsl:apply-templates select="/eai-document" />
<xsl:apply-templates select="eai-timeserie" />
</TimeSeriesDoc>
then it is invalid (and an error).
If the processor doesn't tell you where the error is then that's probably because your Java application is sending the system error output stream to somewhere where you aren't seeing it. Either fix this problem in your application, or try debugging your stylesheets using an environment such as oXygen before you compile them from a Java application. Since your error suggests a lack of familiarity with the basics of the XSLT language, you would really benefit from using a tool such as oXygen (or Stylus Studio or XML Spy) for development.
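On the Java side, one way to stop the diagnostics from disappearing is to register an ErrorListener on the TransformerFactory and to read the exception's message and location. This is only a sketch under assumptions: the class name CompileDiagnostics is made up, and the exact messages and how much detail they carry depend on the XSLT processor on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class CompileDiagnostics {

    /** Compiles a stylesheet and returns every diagnostic the processor reported. */
    public static List<String> compile(String xslt) {
        final List<String> messages = new ArrayList<String>();
        TransformerFactory factory = TransformerFactory.newInstance();
        // Collect diagnostics instead of letting them go to an unseen System.err
        factory.setErrorListener(new ErrorListener() {
            @Override public void warning(TransformerException e) {
                messages.add("warning: " + e.getMessageAndLocation());
            }
            @Override public void error(TransformerException e) {
                messages.add("error: " + e.getMessageAndLocation());
            }
            @Override public void fatalError(TransformerException e) {
                messages.add("fatal: " + e.getMessageAndLocation());
            }
        });
        try {
            factory.newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerConfigurationException e) {
            messages.add("failed: " + e.getMessageAndLocation());
        }
        return messages;
    }

    public static void main(String[] args) {
        // A literal result element in no namespace at the top level, as in the question
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<TimeSeriesDoc><xsl:apply-templates select='/eai-document'/></TimeSeriesDoc>"
          + "</xsl:stylesheet>";
        for (String message : compile(xslt)) {
            System.out.println(message);
        }
    }
}
```

Logging these messages through the application's normal logger is usually enough to see which line the processor objected to.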

Parsing XML with structured element names

I've got some third party XML to parse in the following form. The number of tests is unbounded, but always an integer.
<tests>
<test_1>
<foo bar="baz" />
</test_1>
<test_2>
<foo bar="baz" />
</test_2>
<test_3>
<foo bar="baz" />
</test_3>
</tests>
I'm currently parsing this with XPath, but it's a lot of messing around. Is there any way of expressing this style of XML in an XSD schema and generating JAXB classes from it?
As far as I can see this is impossible; the only thing possible is the <xs:any processContents="lax"/> technique from "how can I define an xsd file that allows unknown (wildcard) elements?". However, this allows any content, not specifically <test_N> where N is an integer. I just want to confirm I'm not missing some XSD/JAXB trick.
Note I would have preferred the XML to be structured like this. I may try to convince the third-party to change.
<tests>
<test id="1">
<foo bar="baz" />
</test>
<test id="2">
<foo bar="baz" />
</test>
<test id="3">
<foo bar="baz" />
</test>
</tests>
While there are ways of dealing with elements with structured names such as numeric suffixes,
XPath: Use string tests against name() or local-name()
XSD: See XSD element name pattern matching
JAXB: See Dealing with poorly designed XML with JAXB
you really should fix the underlying XML design (test_1 should be test) instead.
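As an illustration of the XPath option, here is a minimal sketch with the JDK's javax.xml.xpath API that applies a string test to name() and extracts the numeric suffix. The class and method names are hypothetical, and the sketch assumes the element names carry no namespace prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StructuredNamesDemo {

    /** Returns the suffix N of every <test_N> child of <tests>, in document order. */
    public static List<String> testIds(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // String test against the element name instead of a fixed name step
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/tests/*[starts-with(name(), 'test_')]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> ids = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // Drop the five-character "test_" prefix
                ids.add(nodes.item(i).getNodeName().substring(5));
            }
            return ids;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tests><test_1><foo bar='baz'/></test_1>"
                   + "<test_2><foo bar='baz'/></test_2>"
                   + "<other><foo/></other></tests>";
        System.out.println(testIds(xml));
    }
}
```

This works, but every consumer of the document has to repeat the same name parsing, which is the argument for fixing the XML design instead.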
For completeness, here is a full working example of using XSLT to transform the <test_N> input into the <test id="N"> style:
<tests>
<test_1>
<foo bar="baz" />
</test_1>
<test_2>
<foo bar="baz" />
</test_2>
<test_1234>
<foo bar="baz" />
</test_1234>
<other>
<foo></foo>
</other>
</tests>
XSL
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="*[substring(name(), 1, 5) = 'test_']">
<xsl:element name="test">
<xsl:attribute name="id"><xsl:value-of select="substring(name(), 6, string-length(name()) - 5)" /></xsl:attribute>
<xsl:copy-of select="node()" />
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Code
File input = new File("test.xml");
File stylesheet = new File("test.xsl");
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesource);
StringWriter writer = new StringWriter();
transformer.transform(new StreamSource(input), new StreamResult(writer));
System.out.println(writer);
Output
<?xml version="1.0" encoding="UTF-8"?>
<tests>
<test id="1">
<foo bar="baz"/>
</test>
<test id="2">
<foo bar="baz"/>
</test>
<test id="1234">
<foo bar="baz"/>
</test>
<other>
<foo/>
</other>
</tests>

XSLT: set name of transformed output file

I am using the Apache Camel file component and XSLT component. I have a route where I pick up an XML message, transform it using XSLT, and drop it into a different folder.
Apache Camel DSL route:
<route id="normal-route">
<from uri="file:{{inputfilefolder}}?consumer.delay=5000" />
<to uri="xslt:stylesheets/simpletransform.xsl?transformerFactoryClass=net.sf.saxon.TransformerFactoryImpl" />
<to uri="file:{{outputfilefolder}}" />
</route>
I mention Apache Camel here as well, in case there is a way to set the output file name using Camel. I think that even without Camel there would be a mechanism in pure XSLT.
I need to rename the transformed output file, but I always get the input file name, with the transformed content, in the output folder.
e.g. input file: books.xml
output file: books.xml [with the transformation applied]
What I am looking for is someotherfilename.xml as the output file name. The output data is correct.
I tried <xsl:result-document href="{title}.xml">, but then the output XML is blank. Please help.
Input XML file:
<?xml version="1.0" encoding="UTF-8"?>
<books>
<book.child.1>
<title>Charithram</title>
<author>P Sudarsanan</author>
</book.child.1>
<book.child.2>
<title>Java Concurrency</title>
<author>Joshua Bloch</author>
</book.child.2>
</books>
XSLT:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" version="1.0" encoding="UTF-8"
indent="yes" />
<xsl:variable name="filename" select="'newfilename'" />
<xsl:template match="/">
<xsl:result-document href="{$filename}.xml">
<traders>
<xsl:for-each select="books/*">
<trade>
<title>
<xsl:value-of select="title" />
</title>
</trade>
</xsl:for-each>
</traders>
</xsl:result-document>
</xsl:template>
</xsl:stylesheet>
Output XML when using <xsl:result-document> in the XSLT:
It is blank.
Output XML when not using <xsl:result-document> in the XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<traders xmlns:xs="http://www.w3.org/2001/XMLSchema">
<trade>
<title>Charithram</title>
</trade>
<trade>
<title>Java Concurrency</title>
</trade>
</traders>
Edit: edited the XSLT as per MartinHonnen's comment
Looks like Camel's default is to use the same file name, but you can override it. As the docs mention you can specify the options of interest as follows:
file:directoryName[?options]
One such option is fileName:
Use Expression such as File Language to dynamically set the filename.
For consumers, it's used as a filename filter. For producers, it's
used to evaluate the filename to write.
In short, modify your route as follows:
<route id="normal-route">
<from uri="file:{{inputfilefolder}}?consumer.delay=5000" />
<to uri="xslt:stylesheets/simpletransform.xsl?transformerFactoryClass=net.sf.saxon.TransformerFactoryImpl" />
<to uri="file:{{outputfilefolder}}?fileName=foo.xml" />
</route>
Where foo.xml will be the output file.
Update
You can use Simple or File language to set file names dynamically. There are a few examples in the links.

Calling dateNow() with XALAN

I am using an XSL stylesheet that is invoked from a Java method. I tried to fix this by giving the absolute path of a class, but I don't think that will work, because I couldn't find any example of calling a method in XSL using an absolute class path, so I am trying it within the server environment instead. Here is my code. I have declared the class path and called the method, but I am not getting the proper output. Is this the correct way to call a method?
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xalan="http://xml.apache.org/xalan"
xmlns:datetime="java:com.ibm.date"
exclude-result-prefixes="xalan"
version="1.0">
<xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="yes" indent="no" />
<xsl:strip-space elements="*" />
<xsl:template name="RootToAcknowledgeInventoryRequirement">
<xsl:param name="Root" />
<xsl:variable name="PromiseHeader" select="$Root/PromiseHeader" />
<xsl:variable name="today" select="datetime:dateNow()" />
<xsl:variable name="OrganizationCode">
<xsl:value-of select="$PromiseHeader/@OrganizationCode" />
</xsl:variable>
<_inv:AcknowledgeInventoryRequirement releaseID="">
<_wcf:ApplicationArea>
<oa:CreationDateTime xsi:type="udt:DateTimeType">
<xsl:value-of select="datetime:dateNow()" />
</oa:CreationDateTime>
</_wcf:ApplicationArea>
Since you mention the Xalan processor: it implements XSLT 1.0.
So by default it has no date-time function that returns the current date and time.
It is XSLT 2.0 that supports
<xsl:value-of select="current-dateTime()"/>
<xsl:value-of select="current-date()"/>
<xsl:value-of select="current-time()"/>
You should use the latest version of Michael Kay's Saxon for that.
As a workaround, EXSLT is a well-established option:
Download the code from this link, copy date.xsl to your XSL file's location, and import it in your XSL.
<xsl:stylesheet version="1.0"
xmlns:date="http://exslt.org/dates-and-times"
extension-element-prefixes="date"
...>
<xsl:import href="date.xsl" />
<xsl:template match="//root">
<xsl:value-of select="date:date-time()"/>
</xsl:template>
</xsl:stylesheet>
Solution 2: Pass the date-time value as a parameter to the XSL. Use it as a variable wherever required.
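Solution 2 could be sketched as follows with the standard javax.xml.transform API. The parameter name now, the class name DateParamDemo, the timestamp format, and the cut-down stylesheet are all assumptions made for illustration; the real stylesheet would declare the same xsl:param at the top level.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.text.SimpleDateFormat;
import java.util.Date;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DateParamDemo {

    // Cut-down stylesheet: a top-level xsl:param receives the value set from Java
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:param name='now'/>"
      + "<xsl:template match='/'>"
      + "<CreationDateTime><xsl:value-of select='$now'/></CreationDateTime>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet with the given timestamp bound to the "now" parameter. */
    public static String stamp(String xml, String now) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("now", now);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Compute the timestamp in Java, where date handling is readily available
        String now = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss").format(new Date());
        System.out.println(stamp("<root/>", now));
    }
}
```

This keeps the stylesheet free of processor-specific extension functions, so it behaves the same under Xalan, Saxon, or any other XSLT 1.0 processor.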
