How to transform XMLStreamReader to XMLStreamWriter - java

This should be easy and obvious, but I can't find a way: the XMLOutputFactory accepts only an OutputStream, a Result, or a Writer to generate a new XMLStreamWriter. What I have at hand is an XMLStreamReader, which has no methods for extracting a Result or an OutputStream.
If the solution would be easier using the Event API, that would be OK too.
Thank you

You could use a javax.xml.transform.Transformer to convert a StAXSource wrapping the reader to a StAXResult wrapping the writer.
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
StAXSource source = new StAXSource(xmlStreamReader);
StAXResult result = new StAXResult(xmlStreamWriter);
t.transform(source, result);
Using the Event API, you could also use XMLEventWriter.add(XMLEventReader):
http://download.oracle.com/javase/6/docs/api/javax/xml/stream/XMLEventWriter.html#add(javax.xml.stream.XMLEventReader)
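As a rough illustration of the Event API route, the sketch below wraps an XMLStreamReader in an XMLEventReader (via XMLInputFactory.createXMLEventReader) and hands it to XMLEventWriter.add. The class name StaxEventCopy, the copy helper, and the sample XML are made up for this example; the StringReader and StringWriter merely stand in for whatever input and output you actually have.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of any library.
class StaxEventCopy {

    // Parses the given XML with an XMLStreamReader and copies all of its
    // events into an XMLEventWriter, returning the serialized result.
    static String copy(String xml) throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();

        // This plays the role of the XMLStreamReader you already have.
        XMLStreamReader streamReader =
                inputFactory.createXMLStreamReader(new StringReader(xml));

        // Wrap the cursor-style reader in an event-style reader.
        XMLEventReader eventReader = inputFactory.createXMLEventReader(streamReader);

        StringWriter out = new StringWriter();
        XMLEventWriter eventWriter = outputFactory.createXMLEventWriter(out);

        // Copies every remaining event from the reader to the writer.
        eventWriter.add(eventReader);

        eventWriter.close();
        eventReader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(copy("<greeting lang=\"en\">hello</greeting>"));
    }
}
```

Note that this route ends in an XMLEventWriter rather than an XMLStreamWriter, so it fits best when you control how the writer is created.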

Related

Convert org.w3c.dom.Document to File file

I have an XML file loaded in Java as an org.w3c.dom.Document doc, and I want to convert it into a File file. How can I convert the type Document to File?
thanks
I want to add metadata elements to an existing XML file (standard DITA) that I have as a File.
I know a way to add elements to the file, but for that I have to convert the file to an org.w3c.dom.Document. I did that with the method loadXML:
private Document loadXML(File f) throws Exception {
    // Parse the file into an in-memory DOM tree
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    return builder.parse(f);
}
After that I change the org.w3c.dom.Document. Then I want to continue with the flow of the program, so I have to convert the Document doc back to a File file.
What is an efficient way to do that? Or is there a better way to add some elements to an XML File without converting it?
You can use a Transformer to output the entire XML content to a File, as shown below:
Document doc =...
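// Write the content to an XML file. Passing the File directly lets the
// transformer encode the output itself instead of going through
// FileWriter's platform default charset.
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("/tmp/output.xml"));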
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.transform(source, result);
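To tie this back to the question, here is a minimal sketch of the full round trip: load the File into a Document, append one element, and write the Document back to the same File. The class DomFileRoundTrip, its method names, and the sample element names are assumptions made up for illustration. It writes through an OutputStream in a try-with-resources block so the stream is always closed.

```java
import java.io.File;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not part of any library.
class DomFileRoundTrip {

    // Parses the file into a DOM Document.
    static Document load(File file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(file);
    }

    // Serializes the Document to the file with an identity transformation.
    static void save(Document doc, File file) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (OutputStream out = Files.newOutputStream(file.toPath())) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    // Appends <name>value</name> to the root element and saves in place.
    static void addElement(File file, String name, String value) throws Exception {
        Document doc = load(file);
        Element element = doc.createElement(name);
        element.setTextContent(value);
        doc.getDocumentElement().appendChild(element);
        save(doc, file);
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("topic", ".xml");
        Files.write(file.toPath(),
                "<topic><title>Demo</title></topic>".getBytes(StandardCharsets.UTF_8));
        addElement(file, "author", "Jane");
        System.out.println(new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8));
    }
}
```

The File itself never needs converting: it only names the location, and the Document is re-serialized to that location once the changes are done.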
With JDK 1.8.0 a short way is to use the built-in XMLSerializer (which was introduced with JDK 1.4 as a fork of Apache Xerces)
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
Document doc = //use your method loadXML(File f)
//change Document
java.io.Writer writer = new java.io.FileWriter("MyOutput.xml");
XMLSerializer xml = new XMLSerializer(writer, null);
xml.serialize(doc);
Use an object of type OutputFormat to configure output, for example like this:
OutputFormat format = new OutputFormat(Method.XML, StandardCharsets.UTF_8.toString(), true);
format.setIndent(4);
format.setLineWidth(80);
format.setPreserveEmptyAttributes(true);
format.setPreserveSpace(true);
XMLSerializer xml = new XMLSerializer(writer, format);
Note that these classes come from a com.sun.* package, which is undocumented and therefore generally not seen as the preferred way of doing things. However, javax.xml.transform.OutputKeys does not let you specify the amount of indentation or the line width, for example. So if that matters to you, this solution should help.

Transforming Streaming XSLT Without a Custom Content Handler

Take a look at this website:
http://xmpp.wordpress.com:8008/firehose.xml?type=text/plain
It constantly streams data. You can transform this content using the newest version of XSLT (3.0), with an instruction like this:
<xsl:stream href="http://xmpp.wordpress.com:8008/firehose.xml?type=text/plain">
If I want to write some Java code to initiate the transformation (using Saxon, which has implemented xsl:stream), I can do this:
// XSL
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(new StreamSource(new FileInputStream(xslFile)));
// XML
StreamSource xmlSource = new StreamSource(new FileInputStream(xmlFile));
// Output
MyCustomContentHandler handler = new MyCustomContentHandler();
PrintStream outputPrintStream = new PrintStream(new BufferedOutputStream(new FileOutputStream(outputFile)), true);
handler.setPrintStream(outputPrintStream);
Result result = new SAXResult(handler);
// Transform
transformer.transform(xmlSource, result);
This works. If you let it run for a bit, then open the output file, you’ll see data in it. If you re-open it a bit later, you’ll see even more data. The key to this is the custom content handler that processes the various SAX events.
But suppose that I don’t really want a custom content handler. Suppose I just want to keep the output of the XSLT as is. I can modify my code like this:
// XSL
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(new StreamSource(new FileInputStream(xslFile)));
// XML
StreamSource xmlSource = new StreamSource(new FileInputStream(xmlFile));
// Output
TransformerHandler transformerHandler = ((SAXTransformerFactory) SAXTransformerFactory.newInstance()).newTransformerHandler();
transformerHandler.setResult(new StreamResult(new PrintWriter(new FileOutputStream(outputFile, true), true)));
// or this…
//transformerHandler.setResult(new StreamResult(new FileOutputStream(outputFile)));
// or this…
//transformerHandler.setResult(new StreamResult(new FileWriter(outputFile)));
ContentHandler contentHandler = (ContentHandler) transformerHandler;
SAXResult result = new SAXResult(transformerHandler);
// Transform
transformer.transform(xmlSource, result);
The good news is that I no longer need a custom content handler, and my output now matches the output of the XSLT exactly. The bad news is that although this code works with non-streaming XSLT, it does not work with streaming XSLT. Despite my various attempts at setting the result (see the “or this…” statements above), nothing is written to the file. I suspect there’s a buffering problem of some sort.
Question: How can I combine the best of these two together? How can I transform a streaming XSLT without having to use a custom content handler?
This seems to be a rerun of a thread on the saxon-help list in June:
http://sourceforge.net/p/saxon/mailman/message/32472658/
The conclusion there was that the output was somehow being buffered in the output stream pipeline. Saxon is emitting events representing the transformation result, as you see by supplying a ContentHandler, but the serialization of these events is being buffered in the I/O system.
At this time, it does not appear to be possible to do what I want to do. My current solution is to use a custom content handler (per my question above) and run its results through a standard XSLT identity transformation. A bit ugly and not very efficient, but it works.

How to use Java API to parse xml string with XSLT and generate output in memory only?

I need to transform internal XML (from a response) with a predefined XSLT and send the result back to the client as HTML. The following example that I found reads and generates local files. How can I avoid the file creation with the Java API? I want to replace source.xml with a String and generate the HTML output on the fly.
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer (new javax.xml.transform.stream.StreamSource("searchresult.xslt"));
transformer.transform(new javax.xml.transform.stream.StreamSource("source.xml"),
new javax.xml.transform.stream.StreamResult( new FileOutputStream("result.html")));
StreamSource has a constructor taking a Reader as argument. You can thus pass a StringReader, which will read the XML from a String, as argument.
Similarly, the StreamResult constructor the example uses takes an OutputStream as argument. You can thus pass any kind of OutputStream (like the HTTP response output stream, or a ByteArrayOutputStream, or a socket output stream) to send the result to wherever you like.
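A minimal sketch of that idea, keeping everything in memory: both the stylesheet and the XML are read from Strings through StringReader, and the output is collected in a StringWriter. The class InMemoryXslt and the sample stylesheet and data are invented for this example; in a servlet you would presumably pass the response's output stream or writer to StreamResult instead of a StringWriter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class, not part of any library.
class InMemoryXslt {

    // Sample stylesheet: renders each <item> as an HTML list entry.
    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/results\">"
          + "<ul><xsl:for-each select=\"item\">"
          + "<li><xsl:value-of select=\".\"/></li>"
          + "</xsl:for-each></ul>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Sample input standing in for the XML taken from a response.
    static final String SAMPLE_XML =
            "<results><item>first</item><item>second</item></results>";

    // Applies the stylesheet to the XML without touching the file system.
    static String transform(String xml, String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

If the same stylesheet is applied to many responses, compiling it once with TransformerFactory.newTemplates and creating a Transformer per request would likely be cheaper than calling newTransformer every time.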

Transforming a StAX Source in Java

I have some code like:
XMLInputFactory xif = XMLInputFactory.newInstance()
TransformerFactory tf = TransformerFactory.newInstance("org.apache.xalan.processor.TransformerFactoryImpl", null)
Transformer t = tf.newTransformer()
DOMResult result = new DOMResult()
t.transform(new StAXSource(reader), result)
Which produces the following error:
Caught: javax.xml.transform.TransformerException: Can't transform a Source of type javax.xml.transform.stax.StAXSource
The reader object is of type com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl
So I was stupidly trying the wrong thing. This fixed it:
System.setProperty("javax.xml.transform.TransformerFactory",
"com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl");
I stumbled upon "Xml Transformer give me an error when trying to transform a StaxSource into a StreamResult" while looking to solve the very same issue as you.
The answer provided there seems to be working fine for me, i.e. use:
TransformerFactory.newDefaultInstance();
Instead of:
TransformerFactory.newInstance()
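For reference, a small sketch of the working combination, assuming Java 9 or later (where TransformerFactory.newDefaultInstance() is available): the JDK's built-in transformer accepts a StAXSource and fills a DOMResult. The class StaxToDom, the toDom helper, and the sample XML are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Document;

// Hypothetical example class, not part of any library.
class StaxToDom {

    // Reads the XML with a StAX cursor and builds a DOM Document from it.
    static Document toDom(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));

        // newDefaultInstance() returns the JDK's built-in implementation,
        // regardless of other TransformerFactory implementations on the classpath.
        Transformer transformer = TransformerFactory.newDefaultInstance().newTransformer();

        DOMResult result = new DOMResult();
        transformer.transform(new StAXSource(reader), result);
        reader.close();

        // With no node supplied up front, the result node is a new Document.
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document doc = toDom("<order><item>book</item></order>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

This has the same effect as setting the javax.xml.transform.TransformerFactory system property to the built-in implementation, but without changing global state.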

Simpler way to transform a DOMSource into a StreamSource?

I need to transform a DOMSource into a StreamSource, because a third-party library only accepts stream sources for SOAP.
Performance is not so much of an issue in this case, so I came up with this horribly verbose set of commands:
DOMSource src = new DOMSource(document);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
StreamResult result = new StreamResult();
ByteArrayOutputStream out = new ByteArrayOutputStream();
result.setOutputStream(out);
transformer.transform(src, result);
ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
StreamSource streamSource = new StreamSource(in);
Isn't there a simpler way to do this?
This is as good a way as any. Because your third party library only accepts XML in lexical form, you have no alternative but to serialize the DOM so that the external library can re-parse it. Stupid design - tell them so.
