Generate PDF files using iText and apache velocity template(.vm)

Generate PDF files using iText and apache velocity template(.vm) - java

What is the general workflow to generate a PDF using iText and an Apache Velocity template file (.vm) in Java?
I am interested in knowing steps like: parse template file, put Java object in context and steps to be performed to generate pdf etc.
I know this is a very basic question. But I am not able to find even a single example of this type on the web. I found XDocReport, but I am interested to know other alternatives as well.
Please help me with some sample project link or at least the steps to get started.

Yes, you can.
It all depends on how complex you want the PDFs to be.
Here are the steps for basic functionality
Generate a HTML file using Apache Velocity template file (.vm).
Use com.itextpdf.text.html.simpleparser.HTMLWorker (deprecated) to parse/convert that HTML file into a PDF.
Additionally, you can use com.itextpdf.text.pdf.PdfCopy.PageStamp to add content (borders, stamps, notes, annotations etc) to an existing PDF.
There is also com.itextpdf.tool.xml.XMLWorker for more advanced HTML conversion (adding style sheets etc)

Generating PDF using iText and an Apache Velocity template file (.vm) in Java directly is not possible because:
PDF is binary format,
Velocity generates plain text content.
On other words, Velocity cannot generate PDF.
XDocReport is able to generate a docx/odt report by merging a docx/odt template which contains some Velocity/Freemarker syntax with Java context. The generated docx/odt report can be convert it to pdf/xhtml.
It works because docx/odt are a zip which contains several xml entries. If you unzip a docx you will see word/document.xml. In this entry, you will see the content that you have typed with MS Word. word/document.xml is a plain text, so Velocity can be used in this case.
Here the XDocReport process to generate pdf from a docx template which uses Velocity:
Load docx template. this step consist to unzip the docx and stores in a map each xml entries (name entry as key and byte array as value). For instance map contains a key with word/document.xml and the xml content of this entry as value.
Loop for each xml entries which must be merged with Java context. For instance word/document.xml is merged with Java context by using Velocity and the result of merge replace the word/document.xml value of the map
Rebuild a new docx by zipping each entries of the map.
At this step we have a generated docx (the report).
To convert it to another format, XDocReport provides a docx-to-pdf converter based on Apache POI and iText. Here the XDocReport process to convert a docx to pdf:
Load docx with Apache POI
Loop for each structures of POI (XWPFParagraph, etc.) to create iText structure (iText Paragraph).
Note that XDocReport is modular and you can use other converters as well.

At first,we use freemarker template to generate a html file,and then render html to a pdf file by IItextRender .Finally, we can view pdf file in browser,there has a very useful javascript tools called pdfjs. Maybe you can try it.

Related

xsl stylesheet for generating word documents from Java

I was wondering if there is any Java API which allows to generate Word document similarly Apache FOP does.
With FOP, it is possible to specify a style sheet which defines the layout of the page in which the data (stored in an xml file) are printed.
The Transformer object within FOP library is in charge of that.
Is there any equivalent API for word document?

With FOP, you can try XML to RTF, which Word accepts.
From their webpage, XMLmind XSL-FO Converter apparently generates:
RTF (can be opened in Word 2000+),
WordprocessingML (can be opened inWord 2003+),
Office Open XML (.docx, can be opened in Word 2007+),
Putting FO to one side, here are 2 different approaches:
The first would be to write an XSLT to convert your XML to Flat OPC XML. Most parts in the Flat OPC XML would simply be copied there by your XSLT. (Generate that template content in Word, using "save as XML"). You'll be focusing mainly on populating the document.xml part. Word can open a Flat OPC XML file, or you can use docx4j (a project I work on) to convert Flat OPC XML to docx.
The second would be to use the docx4j Flying Saucer fork to convert your XML + CSS to docx content. See the code samples. You may need to customise it a bit; one way of feeding it CSS is this file. This actually ought to work pretty well; there is stuff there for mapping class attributes to Word styles, so if you could adorn your XML with class attributes, you could get even better results.

I will assume that your input is an XML document, or at least a CSV file.
1) Create an XSLT stylesheet to transform your input into the Word document format. The result will be a file we will call content.xml. You can apply the stylesheet to the input from Java.
2) Create a MS Word shell and put the content.xml into the shell. There are tools within Apache POI to do this.
I may have taken your question too literally. You might also be able to generate the document using Apache POI API. Also, if your MS Word document doesn't have tables, you can use Apache FOP to generate an RTF document, which can then be easily translated to a .docx file.

Apache POI provides Java APIs for reading and writing Microsoft Excel, Word, and PowerPoint files.
You can checkout POI's Javadocs here.

Merge Annotations (in XFDF format ) into PDF

I want to merge XFDF and PDF.
XFDF file contains all types of Annotations (like text comments, stamps, file attachments etc,.). Is there anyway to merge XFDF and PDF programmatically.
I tried with iText PDF, but not successful.

You can achieve this with the PDFTron PDFNet SDK (not free).
https://www.pdftron.com/pdfnet/samplecode/FDFTest.java.html

Java Utility to convert content of any file to text file.

I am looking for a java utility through which a user can convert any type of file (pdf, doc, docx, xls, xlsx, csv, rtf, txt). We have a requirement in which user can upload any type of file and we need to read the content of the file(only text), convert it and store it in an object. That can be done using Apachi poi but I am wondering if any java utility exists?

You may be interested in Apache Tika, which includes the functionality of Apache POI and PDFBox. From the project description, the toolkit: "detects and extracts metadata and structured text content from various documents using existing parser libraries."

I imagine you can't have some sort of universal function for every type of file. You will need to implement conversion methods for each file type. This link helps with PDF files, and will also give you a template to work with your other file types.

I have a PDF document, but I need to substitute some values in Java

I have an application which generates PDFs. Now I'm using Apache FOP just for generate a document from scratch (XML+XSLT). The question is there some kind of library/method that I can treat my source PDF document as a template?
I mean, I create a document with Adobe Acrobat and just set there some markups like ${Name}, ${Surname}, ${Address} and then I put it into the library providing values for Name, Surname and Address.
Hope you can understand me.
Regards.

PDFBox, iText and PDFlib are PDF libraries that allow you to modify existing PDF files instead of only generating them like FOP does. This would allow you to load the template document and replace the placeholders with the actual values.
http://pdfbox.apache.org/
http://itextpdf.com/
http://www.pdflib.com/
PDFBox also provides sample code on how to replace a string in the document with another value: https://pdfbox.apache.org/apidocs/org/apache/pdfbox/examples/pdmodel/ReplaceString.html

Generation of pdf file using java

I am developing a standalone application in Java. I want to generate a pdf file using Java code. I have a display form in which all the details are fetched from database and displayed in the window. Details are Customer Name, Order Details etc.
Now I want to have a button there which says Convert to pdf.
I want to convert this to pdf file with proper alignment and formatting like tables, font etc.
What can be an ideal way to go about it?

I'd suggest you to use reporting tool like a jasperreports.
JasperReports is entirely written in
Java and it is able to use data coming
from any kind of data source and
produce pixel-perfect documents that
can be viewed, printed or exported in
a variety of document formats
including HTML, PDF, Excel, OpenOffice
and Word.
Have a look at other open source projects (pdf api):
Apache PDFBox
Apache Tika (Toolkit for detecting and extracting metadata and structured text content from various documents using POI and PDFBOX parser libs.)
PDFjet

Use iText:
http://itextpdf.com/

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.