All the documents in our application are generated with Java 11 + opensagres/xdocreport 2.0.2 + the Freemarker template engine.
The documents are generated correctly in several languages, including Russian and Chinese.
We've observed that when the input is in Cambodian (Khmer), the generated Word document contains boxes instead of the Cambodian characters.
I've described the issue in more detail here: https://github.com/opensagres/xdocreport/issues/575, but I haven't received an answer so far.
Has anyone managed to generate documents containing this language with opensagres?
Thanks in advance!
The answer was to use the Aspose framework (which, unlike opensagres, is not free).
Its biggest advantages are that you can force the framework to use a set of fonts from the application resources, plus other great features (like smooth and simple PDF conversions).
The only trouble was that Aspose has no integration with Freemarker templates. In our case that would have meant changing a lot of large, complex existing documents.
After some analysis, and with really kind help from Aspose support, we decided to use a hybrid solution:
- Documents are still generated in memory with Opensagres and Freemarker.
- After that, the documents are loaded with Aspose and rendered using the fonts from the application resources. The native font for Cambodian characters is DaunPenh. This font was placed in the application resources.
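For anyone debugging the same symptom, here is a minimal sketch (plain JDK only, not part of opensagres or Aspose) for checking whether a text contains Khmer characters and whether a given font has glyphs for it. The class and method names are made up for illustration.

```java
import java.awt.Font;

// Hypothetical helper, not part of opensagres or Aspose.
final class KhmerTextCheck {

    // True if the text contains at least one code point of the Khmer script.
    static boolean containsKhmer(String text) {
        return text.codePoints().anyMatch(
                cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.KHMER);
    }

    // True if the font has a glyph for every character of the text.
    // Font.canDisplayUpTo returns -1 when the whole string can be displayed.
    static boolean canRender(Font font, String text) {
        return font.canDisplayUpTo(text) == -1;
    }
}
```

A font bundled in the application resources can be loaded with `Font.createFont(Font.TRUETYPE_FONT, inputStream)` and passed to `canRender`; if that returns false for the Khmer input, boxes in the rendered output are to be expected.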
The full topic can be found here: https://forum.aspose.com/t/support-cambodian-language/252057
Related
I am searching for a way for my Java application to generate a Word document from some kind of template (the data for the document will be provided by the application).
Here are the requirements:
- The template should be editable by a non-developer. Creating a Jasper template with the appropriate tool, or editing a Word document with some kind of templating language, is acceptable. Asking them to edit the document's XML file is not.
- The result should be easily editable by a human being using Microsoft Word. For example, the documents generated by Jasper or Birt are not acceptable, as the table layout prevents any easy editing.
So far I have looked at the following solutions and found none that meets both requirements:
Jasper. The generated documents are not easy to edit.
Birt. Same problem.
Generating the XML with a template engine (Velocity, Freemarker). I cannot ask the end client to edit this kind of XML file...
You can check out Templater. It has a pretty good demo page.
Disclaimer: I'm the author.
LibreOffice
LibreOffice is an open-source implementation of an app suite similar to Microsoft Office. Besides supporting the standardized OpenDocument format, it also reads and writes Microsoft Office formats.
LibreOffice offers a Java API. So you may be able to programmatically create documents from a template.
In the past we’ve done something similar, modifying a document with search-and-replace and document-variables.
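The search-and-replace idea can be sketched on plain text like this. It is only an illustration: the `${name}` placeholder syntax and the `PlaceholderFiller` class are assumptions of mine, and a real document would have to be opened and saved through the LibreOffice API (or another library) rather than handled as a string.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces ${key} placeholders in a text with values from a map.
final class PlaceholderFiller {

    // Assumed placeholder syntax: ${letters, digits, dots or underscores}
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown placeholders are left untouched so they stay visible in the output.
            String replacement = values.getOrDefault(m.group(1), m.group(0));
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, `fill("Dear ${name}", Map.of("name", "Ann"))` gives `Dear Ann` (`Map.of` needs Java 9 or later).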
Apache POI
Apache POI is an open-source library for reading and writing Microsoft Office compatible documents.
I don't know its details but you might take a look.
JODReports (open source) and Docmosis (commercial) are designed to use normal/human-managed documents as templates (Word, OpenOffice, etc), merge in your data and return editable documents, PDFs etc. Please note I work for Docmosis.
Both JODReports and Docmosis provide a Java API.
If you are interested in automating OpenOffice or LibreOffice directly (as mentioned in Basil's answer), this blog about converting Doc to PDF will give you a quick start on how to:
load a doc file as a template
search and replace
export to file (pdf in the example)
To change the output format to Doc instead of PDF, change
propertyValues[1].Value = "writer_pdf_Export";
to
propertyValues[1].Value = "MS Word 97";
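If the output format should follow the target file name, the filter name could be picked with a small helper like the sketch below. The helper is hypothetical; `writer_pdf_Export` and `MS Word 97` are the two filter names shown above, while `MS Word 2007 XML` for .docx is the name I recall from the LibreOffice filter list, so please verify it against your installation.

```java
import java.util.Locale;

// Hypothetical helper: maps an output file name to a LibreOffice Writer export filter name.
final class ExportFilters {

    static String filterNameFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return "writer_pdf_Export";
        }
        if (lower.endsWith(".doc")) {
            return "MS Word 97";
        }
        if (lower.endsWith(".docx")) {
            return "MS Word 2007 XML"; // assumption, not taken from the blog example
        }
        throw new IllegalArgumentException("No export filter known for: " + fileName);
    }
}
```

The returned string would then be assigned to `propertyValues[1].Value` before storing the document.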
I hope that helps.
Was searching for this kind of solution as well, and I found XDocReport, including an example of a table. I will give it a try.
In my application the users can configure their own table layouts for displaying data on the screen, by choosing which columns are shown and in which order. Now I want to let my users export these tables to PDF. The tables should fill the page width completely, and the columns should adjust their size to the content of the table, as HTML tables do.
Can you recommend a library or toolchain for this?
I checked Apache FOP, but its fo:table does not support automatic table layout. Creating a Jasper Report dynamically doesn't seem to fit either, because I have to specify the exact column widths there. So, does anyone have an idea how to achieve this?
Open Source solutions with commercially friendly licenses like Apache or LGPL preferred.
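To make the requirement concrete: what I would otherwise have to compute myself is something like the rough sketch below, which splits the page width between columns in proportion to the longest cell text in each column. Character count is only a crude stand-in for real text width (a proper version would measure with font metrics), and `ColumnWidths` is just an illustrative name.

```java
// Rough sketch: distributes a total width across columns by longest cell content.
final class ColumnWidths {

    // rows[r][c] holds the cell text; all rows must have the same number of columns.
    // Returns one width per column; the widths add up to totalWidth.
    static double[] fit(String[][] rows, double totalWidth) {
        int cols = rows[0].length;
        int[] longest = new int[cols];
        for (String[] row : rows) {
            for (int c = 0; c < cols; c++) {
                // At least 1 so that empty columns still get some space.
                longest[c] = Math.max(longest[c], Math.max(1, row[c].length()));
            }
        }
        int sum = 0;
        for (int len : longest) {
            sum += len;
        }
        double[] widths = new double[cols];
        for (int c = 0; c < cols; c++) {
            widths[c] = totalWidth * longest[c] / sum;
        }
        return widths;
    }
}
```

The resulting widths could then be fed to a tool that needs explicit column widths.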
ANSWER: There are no current tools that allow what I hoped for, so I mark this question as resolved.
iText is probably what you are looking for: http://itextpdf.com/. Version 2.1.7 is free to use under the MPL/LGPL license; for more info see the question "What is latest version of itext that is not AGPL?". It will work for your needs, and I would recommend using the code samples on their web page. Most are applicable to version 2.1.7, although the library is currently at version 5.
I would recommend you take a look at Flying Saucer. It is an open source project that uses iText PDF as its core rendering engine but it allows you to define very dynamic xhtml files as your rendering medium.
Benefits of using FS + iText over using iText by itself:
Allows for a much faster dev cycle for changes or new products
Very easy to add in consistent and complex styles using a language built exactly for it (HTML + CSS)
You can use the same HTML code to render your PDF as you use to view it online (if using web application)
Can render graphics using java.awt.Graphics class onto the PDF, meaning you can integrate any graphics library with FS to paint objects like graphs onto your PDF.
Downsides:
While it does a good job of rendering modern CSS styles, it is not perfect. For instance, when I used it the project needed the border-radius style. It wasn't supported, so I had to implement it.
Rendering times and memory consumption are increased (although I have found it to be quite fast and memory efficient).
More libraries in your project.
Here's an example from the FS website if you're interested.
https://github.com/flyingsaucerproject/flyingsaucer/blob/master/flying-saucer-examples/src/main/java/PDFRender.java
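For the dynamic tables in the question, the XHTML itself can be built from the user's column configuration. Below is a minimal sketch using only the JDK; `XhtmlTable` is a made-up helper, and the CSS is just one plausible way to ask for a full-width table with content-driven column sizes.

```java
import java.util.List;

// Hypothetical helper: builds a well-formed XHTML page containing one table.
final class XhtmlTable {

    // Escapes the characters that would otherwise break the XML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String render(List<String> headers, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><style>")
          .append("table { width: 100%; table-layout: auto; border-collapse: collapse; }")
          .append("</style></head><body><table><tr>");
        for (String header : headers) {
            sb.append("<th>").append(escape(header)).append("</th>");
        }
        sb.append("</tr>");
        for (List<String> row : rows) {
            sb.append("<tr>");
            for (String cell : row) {
                sb.append("<td>").append(escape(cell)).append("</td>");
            }
            sb.append("</tr>");
        }
        sb.append("</table></body></html>");
        return sb.toString();
    }
}
```

The resulting string is what you would hand to Flying Saucer for rendering, along the lines of the linked PDFRender example.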
[Background Info]
We had a solution in place that used server-side Word automation to convert HTM documents into Docx, PDF or print documents. This solution broke on the latest version of Windows Server 2012. We learned that MS does not intend for Word to work in this manner, and after troubleshooting with MS support engineers we have come to the conclusion that it will never work.
[Currently]
I am currently researching potential technologies and tools that my company can use to regain this functionality. We need to be able to create Docx, PDF and print files to a local printer.
I have looked into a number of tools already and I am currently leaning towards Apache FOP, which seems to handle PDF and printing for us.
However, I'm looking for advice and suggested tools that we could use to implement a pure Java approach. Currently our application creates HTM files with all the required information. So ideally we would like to take these HTM files and "convert" them into Docx/XSL-FO format.
[Question]
So here is my question, which I'm hoping you will be able to help me with.
What are the best tools that I can use to get from:
HTM to Docx
HTM to PDF
Or what would be the best process for achieving this? Has anyone had success finding a solution for this in the past?
Thank You
It depends on the level of control and the complexity of the source HTML. There are HTML to FO stylesheets but you might find them wanting for your specific need.
So you could use the Jericho parser to read the HTML and generate FO. Or you could generate the target format directly using Apache PDFBox and Apache POI.
It all boils down to the level of control you want/need.
docx4j-ImportXHTML will get you from XHTML to docx. From there, you can use docx4j (or some other solution eg LibreOffice/OpenOffice) to do docx to PDF.
docx4j supports docx to XSL FO, and by default uses FOP.
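Since the importer takes XHTML rather than loose HTML, it can be worth checking that your HTM files are well-formed XML before handing them over. Here is a minimal sketch using only the JDK's XML parser; `XhtmlCheck` is an illustrative name and not part of docx4j.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-check, independent of docx4j.
final class XhtmlCheck {

    // True if the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve any DTD to an empty source so that no network access happens.
            // Side effect: HTML entities such as &nbsp; count as errors.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Files that fail the check (unclosed `<br>` tags are a typical cause) would need tidying into XHTML first.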
I have some PDF template (with header and footer). I want to generate documents that are based on that template.
Is there any way to do that with iText? Thank you
P.S. Right now I generate the document on the fly, i.e. every time I generate the header, the footer and the content itself.
UPDATE: I have found an incredible library called PD4ML. It's not free, but not that expensive, and it has really cool features such as on-the-fly HTML-to-PDF conversion, support for a lot of HTML and CSS, and even its own JSP tag library! So I really suggest it when you need an alternative to the heavy and memory-hungry JasperReports.
You can use JasperReports library and the iReport visual designer.
JasperReports uses iText to produce PDFs from "jasper" templates, which are XML files (following the jrxml DTD) compiled into Java classes, and it also lets you use the same template to generate MS Office files (with POI), HTML, etc.
I'm not sure about iText, but you can use BIRT for this purpose: http://www.eclipse.org/birt/. It's overkill for PDF creation alone; you can do a lot (more than you can imagine) with it.
If you can choose your template format, I would go with JODReports and JODConverter.
JODReports uses an ODT template and fills in the template's fields from your Java code.
JODConverter uses LibreOffice to convert such a template to PDF or whatever other format LibreOffice can export.
You have to be able to use LibreOffice as a service installed on a remote machine.
I used it back in 2012, but I'm not sure if the project is still active.
I am generating PDF documents for various purposes using the iText library.
It's like one class per PDF document. There are a lot of similarities among the classes, which I have listed below:
The fields have an (x, y) location
A field can be wrapped after a certain number of words
A field can have a value which is a function of one or more parameters
Subsequent pages of the PDF have to be kept the same or be different
I am thinking of describing this layout through an XML file. Any thoughts or innovative ideas for solving this are welcome.
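To make the idea concrete, here is a rough sketch of what I have in mind, using only the JDK's XML parser. The element and attribute names (`layout`, `field`, `name`, `x`, `y`, `wrapAfterWords`) and the `PdfLayout` class are invented for illustration. It covers only position and word wrapping, not computed values or per-page differences.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical layout reader; the XML vocabulary is an assumption.
final class PdfLayout {

    static final class Field {
        final String name;
        final float x;
        final float y;
        final int wrapAfterWords;

        Field(String name, float x, float y, int wrapAfterWords) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.wrapAfterWords = wrapAfterWords;
        }
    }

    // Reads every <field name=".." x=".." y=".." wrapAfterWords=".."/> element.
    static List<Field> parse(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("field");
            List<Field> fields = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                fields.add(new Field(
                        e.getAttribute("name"),
                        Float.parseFloat(e.getAttribute("x")),
                        Float.parseFloat(e.getAttribute("y")),
                        Integer.parseInt(e.getAttribute("wrapAfterWords"))));
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid layout XML", e);
        }
    }

    // Splits the text into lines of at most wordsPerLine words each.
    static List<String> wrap(String text, int wordsPerLine) {
        if (wordsPerLine < 1) {
            throw new IllegalArgumentException("wordsPerLine must be positive");
        }
        String[] words = text.trim().split("\\s+");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < words.length; i += wordsPerLine) {
            int end = Math.min(i + wordsPerLine, words.length);
            lines.add(String.join(" ", Arrays.copyOfRange(words, i, end)));
        }
        return lines;
    }
}
```

A single generic class could then loop over the parsed fields and place each wrapped line at its (x, y) position with iText, instead of one hand-written class per document.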
Take a look at the PDFBox library, which is now in the Apache Incubator.
PDFBox is nice; I used it before and got good help from the developer. You might also want to have a look at XSL-FO. It is an XML-based formatting language whose output can be rendered as PDF (and other formats) using Apache FOP.
What about Prince? It's a formatter that converts HTML/XML to PDF using CSS files for styling, and it has a Java API. It's not free, though (apart from the free Personal License).
Flying Saucer supports using XHTML/CSS to create PDFs.