How do I convert an MS Word document to PDF? If there is an example, please share it
with me. Thanks.
If you are required to use POI, I guess you should take a look at org.apache.poi.hwpf.converter.
I have never tried this, but I guess it's worth a try at least.
It seems you can use WordToFoConverter to convert your document to an FO file (example here). Note that this converter works on an HWPFDocument (the binary .doc format), not an XWPFDocument.
From there you can use Apache FOP to transform the FO file to a PDF, like this:
// Step 1: Construct a FopFactory
// (reuse if you plan to render multiple documents!)
FopFactory fopFactory = FopFactory.newInstance();
// Step 2: Set up output stream.
// Note: Using BufferedOutputStream for performance reasons (helpful with FileOutputStreams).
OutputStream out = new BufferedOutputStream(new FileOutputStream(new File("C:/Temp/myfile.pdf")));
try {
    // Step 3: Construct fop with desired output format
    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
    // Step 4: Setup JAXP using identity transformer
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(); // identity transformer
    // Step 5: Setup input and output for XSLT transformation
    // Setup input stream
    Source src = new StreamSource(new File("C:/Temp/myfile.fo"));
    // Resulting SAX events (the generated FO) must be piped through to FOP
    Result res = new SAXResult(fop.getDefaultHandler());
    // Step 6: Start XSLT transformation and FOP processing
    transformer.transform(src, res);
} finally {
    // Clean-up
    out.close();
}
This code was taken from https://xmlgraphics.apache.org/fop/0.95/embedding.html in case you want to read more on this topic.
Related
I'm creating an integration solution in Java that filters and modifies huge XML files. Those XML files pass through the solution as a payload document, and I want to use an XSLT stylesheet to filter out the parts that are of interest to me.
My difficulty is that the standard Java approach (XSLT processing with Java?) does not work for me: I do not want to read the XML from the file system, since it is already in the solution's workflow, and I need the output to stay in the workflow as well.
Element production = docX2.createElement("PRODUCTION");
try {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource("slimmer.xslt");
Transformer transformer = factory.newTransformer(xslt);
Source text = new StreamSource((InputStream) docX1);
transformer.transform(text, new StreamResult((OutputStream) production));
} catch (Exception ex) {
Logger.getLogger(IntProcess.class.getName()).log(Level.SEVERE, null, ex);
}
root.appendChild(production);
docX1 is the XML input document flowing through the solution, and docX2 is the output document (both are instances of Java's Document class). production is an element created in docX2.
I solved it with the help of this question: Transform XML with XSLT in Java using DOM
My solution is:
Element production = docX2.createElement("PRODUCTION");
try {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource("slimmer.xslt");
Transformer transformer = factory.newTransformer(xslt);
Source text = new DOMSource(docX1);
transformer.transform(text, new DOMResult(production));
} catch (Exception ex) {
Logger.getLogger(IntProcess.class.getName()).log(Level.SEVERE, null, ex);
}
root.appendChild(production);
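The problem was that I tried to use stream sources and results instead of DOM ones. A Document cannot be cast to an InputStream, nor an Element to an OutputStream, so DOMSource and DOMResult are needed.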
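For anyone who wants to try the DOM-to-DOM approach in isolation, here is a minimal sketch. The inline stylesheet is a hypothetical stand-in for slimmer.xslt (which is not shown in the question), and the class name, the ROOT element, and the item/keep names are assumptions made only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomToDomExample {
    // Hypothetical stand-in for slimmer.xslt: copies only <item keep="true"> elements.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'>"
        + "<kept><xsl:copy-of select=\"//item[@keep='true']\"/></kept>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static Document transformInto(String inputXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder builder = dbf.newDocumentBuilder();

        // docX1: the in-memory input document (parsed from a string here only for the demo)
        Document docX1 = builder.parse(new InputSource(new StringReader(inputXml)));

        // docX2: the in-memory output document with a PRODUCTION element to fill
        Document docX2 = builder.newDocument();
        Element root = docX2.createElement("ROOT");
        docX2.appendChild(root);
        Element production = docX2.createElement("PRODUCTION");
        root.appendChild(production);

        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // The result tree is appended as children of the PRODUCTION element
        transformer.transform(new DOMSource(docX1), new DOMResult(production));
        return docX2;
    }

    public static void main(String[] args) throws Exception {
        Document out = transformInto(
            "<items><item keep='true'>a</item><item keep='false'>b</item></items>");
        System.out.println(out.getElementsByTagName("item").getLength());
    }
}
```

Nothing touches the file system here except the stylesheet source, which could equally be loaded from a file as in the question.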
I'm trying to transform XML financial data to PDF in Java using XSLT and Apache FOP, but I'm getting the following exception while transforming the XML to PDF with the generated XSL-FO.
Caused by: org.xml.sax.SAXParseExceptionpublicId: -//W3C//DTD HTML 4.01 Transitional//EN; systemId: http://www.w3.org/TR/html4/loose.dtd; lineNumber: 31; columnNumber: 3; The declaration for the entity "HTML.Version" must end with '>'.
http://www.w3.org/TR/html4/loose.dtd is referenced in my XSLT file, and that DTD really does contain the line without the closing '>'. I read at https://sourceforge.net/p/saxon/mailman/message/23058335/ that it is an SGML DTD. Is it the case that I can't transform this XSLT to XSL-FO using Apache FOP because the underlying Saxon can't parse an SGML DTD?
The code that generates the XSL-FO with the XSLT and then converts the XSL-FO to PDF looks like the following. Could someone tell me what I'm doing wrong, and how I can transform the XML to PDF? Thanks in advance.
private byte[] generateFOFromXML(Source xslt, Source invoice) throws TransformerException {
ByteArrayOutputStream out = new ByteArrayOutputStream();
try {
//Setup XSLT
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(xslt);
//Setup input for XSLT transformation
//Resulting SAX events (the generated FO) must be piped through to FOP
Result res = new StreamResult(out);
//Start XSLT transformation and FOP processing
transformer.transform(invoice, res);
return out.toByteArray();
} finally {
// try {
// out.close();
// } catch (IOException e) {
// e.printStackTrace();
// }
}
}
byte[] xslFO = generateFOFromXML(xsltSource, invoiceSource);
FopFactoryBuilder builder = new FopFactoryBuilder(new File(".").toURI());
builder.setStrictFOValidation(false);
FopFactory fopFactory= builder.build();
// FopFactory fopFactory = FopFactory.newInstance(new File(".").toURI());
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
ByteArrayOutputStream tempBAO = new ByteArrayOutputStream();
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, tempBAO);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setParameter("versionParam", "2.0");
Result result = new SAXResult(fop.getDefaultHandler());
Source foSource = new StreamSource(new ByteArrayInputStream(xslFO));
transformer.transform(foSource, result);
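One possible workaround for the DTD error, assuming the DOCTYPE sits in the stylesheet and nothing in it relies on entities declared by that DTD: parse the stylesheet yourself with an EntityResolver that answers every external entity request with empty content, then hand the resulting DOM to the TransformerFactory. This is only a sketch using the JDK's JAXP classes; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SkipExternalDtd {
    // Parses XML without fetching the external DTD named in its DOCTYPE.
    static Document parseIgnoringDtd(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT stylesheets must be parsed namespace-aware
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // Assumption: the DTD content is not needed, so every external entity
        // (including the DTD itself) is replaced by an empty document.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new DOMSource(parseIgnoringDtd(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xslt =
            "<!DOCTYPE xsl:stylesheet PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\""
            + " \"http://www.w3.org/TR/html4/loose.dtd\">"
            + "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'><xsl:value-of select='/invoice/total'/></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xslt, "<invoice><total>42</total></invoice>"));
    }
}
```

If the stylesheet uses entities such as &nbsp; that only the DTD defines, this will fail with an undefined-entity error. In that case removing the DOCTYPE and using numeric character references is probably the cleaner fix.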
This is my first time using Stack Overflow. I'm a student doing a project for analytics purposes. The company stores all its records in XML, and I have to convert them and write a program that automatically sends a report by email.
I'm using Java to parse the XML and am currently trying Apache Commons Digester, since other parsers need XSLT to do that. I want a program that doesn't depend on XSLT, because the company wants a system that sends a summary report about every 5 minutes, and XSLT may be quite slow according to some of the answers I have seen here. How can I do this using Digester or another method? Please show some example code for the conversion if possible.
Here is the sample code I have built with Digester:
public void run() throws IOException, SAXException {
Digester digester = new Digester();
// This method pushes this (SampleDigester) class to the Digesters
// object stack making its methods available to processing rules.
digester.push(this);
// This set of rules calls the addDataSource method and passes
// in five parameters to the method.
digester.addCallMethod("datasources/datasource", "addDataSource", 5);
digester.addCallParam("datasources/datasource/name", 0);
digester.addCallParam("datasources/datasource/driver", 1);
digester.addCallParam("datasources/datasource/url", 2);
digester.addCallParam("datasources/datasource/username", 3);
digester.addCallParam("datasources/datasource/password", 4);
File file = new File("C:\\Users\\1206432E\\Desktop\\datasource.xml");
// This method starts the parsing of the document.
//digester.parse("file:\\Users\\1206432E\\Desktop\\datasource.xml");
digester.parse(file);
}
Another one, which is built using DOM to convert XML to CSV but still relies on an XSLT file:
public static void main(String args[]) throws Exception {
File stylesheet = new File("C:\\Users\\1206432E\\Desktop\\Style.xsl");
File xmlSource = new File("C:\\Users\\1206432E\\Desktop\\data.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(xmlSource);
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = TransformerFactory.newInstance()
.newTransformer(stylesource);
Source source = new DOMSource(document);
Result outputTarget = new StreamResult(
new File("C:\\Users\\1206432E\\Desktop\\temp.csv"));
transformer.transform(source, outputTarget);
}
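For comparison, a rough sketch of doing the conversion with plain DOM and no XSLT or Digester at all. It assumes the same datasources/datasource layout that the Digester rules above use; the class name is made up, and the CSV writing is deliberately naive (no quoting or escaping of commas).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DatasourceCsv {
    // Child elements of <datasource>, taken from the Digester rules in the question
    static final String[] FIELDS = {"name", "driver", "url", "username", "password"};

    static String toCsv(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Header row
        StringBuilder csv = new StringBuilder(String.join(",", FIELDS)).append('\n');

        NodeList sources = doc.getElementsByTagName("datasource");
        for (int i = 0; i < sources.getLength(); i++) {
            Element datasource = (Element) sources.item(i);
            List<String> row = new ArrayList<>();
            for (String field : FIELDS) {
                NodeList hits = datasource.getElementsByTagName(field);
                // A missing element becomes an empty cell
                row.add(hits.getLength() > 0 ? hits.item(0).getTextContent().trim() : "");
            }
            // Naive CSV: values containing commas or quotes would need escaping
            csv.append(String.join(",", row)).append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toCsv(
            "<datasources><datasource><name>mydb</name><driver>org.h2.Driver</driver>"
            + "<url>jdbc:h2:mem:x</url><username>sa</username><password>secret</password>"
            + "</datasource></datasources>"));
    }
}
```

For very large files a streaming parser (SAX or StAX) would avoid loading the whole document into memory, but the DOM version is easier to follow.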
Is there any way to run only one XSLT transformation and render the output to PDF, PNG, and SVG files?
StreamSource contentSource = new StreamSource(xmlContentStream);
StreamSource transformSource = new StreamSource(xslFoMarkupStream);
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
Transformer xslfoTransformer = getTransformer(transformSource);
Fop fop = fopFactory.newFop("application/pdf", foUserAgent, outStream);
Result res = new SAXResult(fop.getDefaultHandler());
// Start XSLT transformation and FOP processing
xslfoTransformer.transform(contentSource, res);
xmlContentStream.close();
xslFoMarkupStream.close();
return outStream;
In the case above, to generate a PDF and then a PNG I have to create a new Fop instance with a different MIME type and then call xslfoTransformer.transform() again.
That means I run the transformation twice. Is there a way to run the transformation once and then render the output to different formats (a custom renderer, perhaps)?
Or are there any suggestions for speeding up the rendering, since I still need to do it several times: once each for PDF, PNG, and SVG?
I also tried generating the PDF via FOP and then converting it to an image with Apache PDFBox. That works slightly faster, but it seems like a silly approach.
Thank you.
You can save one step. Your code above does two things: it takes some arbitrary XML and transforms it into XSL-FO using XSLT, and then it renders that output into whatever format you want. You could do the XML to XSL-FO transformation (probably the slower part) once and use the result as input to two Fop instances. Something like this:
public void fopReport(OutputStream pdfOut, OutputStream jpgOut, Source xmlSource, Source xsltSource) throws Exception {
// Create the FO content
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(xsltSource);
ByteArrayOutputStream foBytesStream = new ByteArrayOutputStream();
StreamResult foByteStreamResult = new StreamResult(foBytesStream);
transformer.transform(xmlSource, foByteStreamResult);
byte[] foBytes = foBytesStream.toByteArray();
// Render twice
FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent uaPDF = fopFactory.newFOUserAgent();
FOUserAgent uaJpg = fopFactory.newFOUserAgent();
Fop fopPDF = fopFactory.newFop(MimeConstants.MIME_PDF, uaPDF, pdfOut);
Fop fopJpg = fopFactory.newFop(MimeConstants.MIME_JPEG, uaJpg, jpgOut);
//PDF
Source src = new StreamSource(new ByteArrayInputStream(foBytes));
Transformer resultTransformer = factory.newTransformer();
resultTransformer.transform(src, new SAXResult(fopPDF.getDefaultHandler()));
// JPEG
src = new StreamSource(new ByteArrayInputStream(foBytes));
resultTransformer = factory.newTransformer();
resultTransformer.transform(src, new SAXResult(fopJpg.getDefaultHandler()));
}
Hope that helps
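On the speed-up question, one more thing that may be worth trying if the same stylesheet is applied many times: compile it once into a javax.xml.transform.Templates object and create a fresh Transformer from it for each run, instead of re-parsing the stylesheet every time. A minimal sketch with the plain JDK API follows; the class name is hypothetical, and the FO bytes it returns would be fed to FOP as in the code above.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;

class CompiledStylesheet {
    // Templates is the compiled form of the stylesheet and is safe to share between threads
    private final Templates templates;

    CompiledStylesheet(Source xslt) throws TransformerConfigurationException {
        // The stylesheet is parsed and compiled only once, here
        this.templates = TransformerFactory.newInstance().newTemplates(xslt);
    }

    // Runs the transformation and returns the output (for example XSL-FO) as bytes
    byte[] apply(Source xml) throws TransformerException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Transformer instances are cheap to create from Templates but not thread-safe
        templates.newTransformer().transform(xml, new StreamResult(out));
        return out.toByteArray();
    }
}
```

Usage would look like `new CompiledStylesheet(new StreamSource(xsltFile)).apply(new StreamSource(xmlFile))`, keeping the CompiledStylesheet instance around between reports. How much this saves depends on the stylesheet, so it is worth measuring.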
I have an XML file. I need to read it, make some changes, and write the changed version to a new destination.
I managed to read, parse, and patch this file (with DocumentBuilderFactory, DocumentBuilder, Document, and so on).
But I cannot find a way to save the file. Is there a way to get its plain-text form (as a String), or is there a better way?
Something like this works:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
Result output = new StreamResult(new File("output.xml"));
Source input = new DOMSource(myDocument);
transformer.transform(input, output);
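If you want the document as a String rather than a file, the same identity transformer can write into a StringWriter. A small sketch using only the JDK; the class and method names are just for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class DomToString {
    static String toXmlString(Document document) throws TransformerException {
        // No stylesheet argument: this is the identity transformer
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Optional: leave out the <?xml ...?> declaration
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }
}
```

Other output properties such as OutputKeys.INDENT or OutputKeys.ENCODING can be set on the transformer in the same way.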
This will work, provided you're using Xerces-J:
public String serialise(org.w3c.dom.Document document) throws java.io.IOException {
    java.io.ByteArrayOutputStream data = new java.io.ByteArrayOutputStream();
    java.io.PrintStream ps = new java.io.PrintStream(data);
    org.apache.xml.serialize.OutputFormat of =
        new org.apache.xml.serialize.OutputFormat("XML", "ISO-8859-1", true);
    of.setIndent(1);
    of.setIndenting(true);
    org.apache.xml.serialize.XMLSerializer serializer =
        new org.apache.xml.serialize.XMLSerializer(ps, of);
    // As a DOM Serializer
    serializer.asDOMSerializer();
    serializer.serialize(document);
    ps.flush();
    // Decode with the same encoding the serializer wrote
    return data.toString("ISO-8859-1");
}
This lets you define the XML output format yourself (XMLWriter and OutputFormat here appear to be the dom4j classes):
new XMLWriter(new FileOutputStream(fileName),
new OutputFormat(){{
setEncoding("UTF-8");
setIndent(" ");
setTrimText(false);
setNewlines(true);
setPadText(true);
}}).write(document);