In my project I have JSP pages that should print PDF documents.
You can use iText, FOP, or another library.
Take a look at:
Compare these products for PDF generation with Java given requirements inside: iText, Apache PDFBox or FOP?
Are there any Java PDF creation alternatives to iText?
See the following which will convert HTML to PDF.
https://code.google.com/p/flying-saucer/
Easier than working with iText directly.
What is the general workflow to generate a PDF using iText and an Apache Velocity template file (.vm) in Java?
I am interested in knowing steps like: parse template file, put Java object in context and steps to be performed to generate pdf etc.
I know this is a very basic question. But I am not able to find even a single example of this type on the web. I found XDocReport, but I am interested to know other alternatives as well.
Please help me with some sample project link or at least the steps to get started.
Yes, you can.
It all depends on how complex you want the PDFs to be.
Here are the steps for basic functionality:
Generate an HTML file using an Apache Velocity template file (.vm).
Use com.itextpdf.text.html.simpleparser.HTMLWorker (deprecated) to parse/convert that HTML file into a PDF.
Additionally, you can use com.itextpdf.text.pdf.PdfCopy.PageStamp to add content (borders, stamps, notes, annotations etc) to an existing PDF.
There is also com.itextpdf.tool.xml.XMLWorker for more advanced HTML conversion (adding style sheets etc)
Generating PDF using iText and an Apache Velocity template file (.vm) in Java directly is not possible because:
PDF is binary format,
Velocity generates plain text content.
In other words, Velocity cannot generate PDF.
XDocReport is able to generate a docx/odt report by merging a docx/odt template, which contains some Velocity/Freemarker syntax, with a Java context. The generated docx/odt report can then be converted to pdf/xhtml.
It works because a docx/odt file is a zip archive which contains several XML entries. If you unzip a docx you will see word/document.xml. In this entry, you will see the content that you have typed with MS Word. word/document.xml is plain text, so Velocity can be used in this case.
Here is the XDocReport process to generate a PDF from a docx template which uses Velocity:
Load the docx template. This step consists of unzipping the docx and storing each XML entry in a map (entry name as key, byte array as value). For instance, the map contains the key word/document.xml with the XML content of that entry as its value.
Loop over each XML entry which must be merged with the Java context. For instance, word/document.xml is merged with the Java context using Velocity, and the result of the merge replaces the word/document.xml value in the map.
Rebuild a new docx by zipping each entry of the map.
At this step we have a generated docx (the report).
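The three steps above can be sketched with nothing but java.util.zip. This is only an illustration of the idea, not XDocReport's actual code: the class name DocxZipMerge and its methods are made up for this example, and the template engine is passed in as a plain function so that Velocity (or anything else that maps text to text) could be plugged in. It assumes Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of XDocReport.
final class DocxZipMerge {

    // Step 1: unzip the archive into a map of entry name -> raw bytes.
    static Map<String, byte[]> load(byte[] archive) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                if (!e.isDirectory()) {
                    entries.put(e.getName(), zin.readAllBytes());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return entries;
    }

    // Step 2: run the template engine over one text entry and store the result
    // back under the same key. The engine is an assumption: any text -> text function.
    static void merge(Map<String, byte[]> entries, String name, UnaryOperator<String> engine) {
        byte[] raw = entries.get(name);
        if (raw == null) {
            throw new IllegalArgumentException("no such entry: " + name);
        }
        String merged = engine.apply(new String(raw, StandardCharsets.UTF_8));
        entries.put(name, merged.getBytes(StandardCharsets.UTF_8));
    }

    // Step 3: zip every entry of the map back into a new archive.
    static byte[] rebuild(Map<String, byte[]> entries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                zout.putNextEntry(new ZipEntry(e.getKey()));
                zout.write(e.getValue());
                zout.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }
}
```

A caller would do something like load(templateBytes), then merge(entries, "word/document.xml", engine), then rebuild(entries). Entries that are not merged (images, styles) pass through byte for byte.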
To convert it to another format, XDocReport provides a docx-to-pdf converter based on Apache POI and iText. Here is the XDocReport process to convert a docx to PDF:
Load docx with Apache POI
Loop over each POI structure (XWPFParagraph, etc.) and create the corresponding iText structure (iText Paragraph).
Note that XDocReport is modular and you can use other converters as well.
First, we use a FreeMarker template to generate an HTML file, and then render the HTML to a PDF file with ITextRenderer. Finally, the PDF file can be viewed in the browser; there is a very useful JavaScript tool for that called pdf.js. Maybe you can try it.
I have an application which generates PDFs. Right now I'm using Apache FOP only to generate a document from scratch (XML+XSLT). The question is: is there some kind of library or method that lets me treat my source PDF document as a template?
I mean, I create a document with Adobe Acrobat and just set there some markups like ${Name}, ${Surname}, ${Address} and then I put it into the library providing values for Name, Surname and Address.
Hope you can understand me.
Regards.
PDFBox, iText and PDFlib are PDF libraries that allow you to modify existing PDF files instead of only generating them like FOP does. This would allow you to load the template document and replace the placeholders with the actual values.
http://pdfbox.apache.org/
http://itextpdf.com/
http://www.pdflib.com/
PDFBox also provides sample code on how to replace a string in the document with another value: https://pdfbox.apache.org/apidocs/org/apache/pdfbox/examples/pdmodel/ReplaceString.html
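The substitution step itself, independent of any PDF library, can be sketched in plain Java. This is only an illustration of resolving ${...} placeholders in a string that has already been extracted; the class PlaceholderFill is made up for this example and does not touch PDF content streams, where a placeholder may not be stored as one contiguous string. It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills ${Name}-style placeholders in plain text.
final class PlaceholderFill {

    // Matches ${Identifier} and captures the identifier.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    static String fill(String text, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Placeholders without a value are left untouched.
            String replacement = value != null ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, fill("${Name} ${Surname}", Map.of("Name", "Ada", "Surname", "Lovelace")) gives "Ada Lovelace".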
I am developing a standalone application in Java. I want to generate a pdf file using Java code. I have a display form in which all the details are fetched from database and displayed in the window. Details are Customer Name, Order Details etc.
Now I want to have a button there which says Convert to pdf.
I want to convert this to pdf file with proper alignment and formatting like tables, font etc.
What can be an ideal way to go about it?
I'd suggest using a reporting tool like JasperReports.
JasperReports is entirely written in Java and it is able to use data coming from any kind of data source and produce pixel-perfect documents that can be viewed, printed or exported in a variety of document formats including HTML, PDF, Excel, OpenOffice and Word.
Have a look at other open source projects (pdf api):
Apache PDFBox
Apache Tika (Toolkit for detecting and extracting metadata and structured text content from various documents using POI and PDFBOX parser libs.)
PDFjet
Use iText:
http://itextpdf.com/
I was looking at using iText to create both a pdf and html version of a document with RTF as a possible option. According to this question this is no longer possible with iText. Is there a library that will allow me to create a document in Java and output it as both PDF and HTML? The ability to output RTF would be nice but is not required.
As that answer to the other question states, you can just use the iText RTF Library.
I have used PD4ML to convert HTML to PDF. Even though it is a commercial product, it is very reliable and supports CSS well.
JasperReports. If you look at this package it supports export to:
pdf
html
rtf
xls
xml
You have two options to create the documents:
via iReport - a visual designer for reports
via an API, where you construct everything with Java code.
Note that even though JasperReports's main function is to create reports, it can very well create other documents, with no tabular data for example.
You could also try Docmosis since that supports the output formats provided by OpenOffice (including the ones you specified) and you can often do the job with a lot less code.
Does anybody know of an open source Java library that will do robust diffing of the text parts of PDF files?
Ideally I would like something that would produce a diff in the form of a patch.
Extract the pdf text with http://incubator.apache.org/pdfbox/ and create a diff with http://code.google.com/p/google-diff-match-patch.
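Once the text has been extracted, the diff step can be sketched in plain Java. The following is a stand-in for google-diff-match-patch, not its API: a small line-based diff built on a longest-common-subsequence table, with a made-up class name TextLineDiff. The output is patch-like ("- " for removed lines, "+ " for added lines, two spaces for unchanged lines) but it is not the unified diff format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: line-based diff of two extracted texts.
final class TextLineDiff {

    static List<String> diff(String oldText, String newText) {
        String[] a = oldText.split("\n", -1);
        String[] b = newText.split("\n", -1);

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both texts, emitting kept, removed and added lines.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("  " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + a[i]);
                i++;
            } else {
                out.add("+ " + b[j]);
                j++;
            }
        }
        while (i < a.length) {
            out.add("- " + a[i++]);
        }
        while (j < b.length) {
            out.add("+ " + b[j++]);
        }
        return out;
    }
}
```

The table needs memory proportional to the product of the two line counts, so for large documents a dedicated diff library is the better choice.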
If the PDFs are different only in text, you could also rasterize the pages and then look at the differences that way - we use that for regression testing output on our PDF code.
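The comparison after rasterizing can be sketched with java.awt.image alone. This assumes both pages have already been rendered to BufferedImage objects of the same size by whatever PDF renderer you use; the rendering itself is not shown, and the class name PageImageDiff is made up for this example.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: compares two rasterized pages pixel by pixel.
final class PageImageDiff {

    // Returns the number of pixels whose ARGB value differs between the two images.
    static long countDifferentPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("page images differ in size");
        }
        long different = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }
}
```

A regression test could then assert that the count is zero, or below some tolerance if the renderer's anti-aliasing is not fully deterministic.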
You can take a look at xdiffweb.com. It's a pure Java open source project based on Apache PDFBox.