This is not a duplicate question. I had searched and tried many options before posting this question.
We have a web page where users enter data through text boxes, text areas, image uploads, and rich text editors. This data has to be merged into an existing report, like filling in the blanks.
I was able to do this with Apache FOP when the user input is plain text. But Apache FOP doesn't work when the input is rich text (HTML): FOP does not render HTML, it just writes the raw markup (e.g. <strong> XYZ </strong>) into the PDF.
I tried iText, but although iText supports rendering HTML to PDF, it did not place the images referenced by <img> tags in the PDF file.
I could build the PDF block by block with the iText API, but then the rich text entered by the user cannot be embedded between those blocks, because block-by-block construction and HTML-to-PDF conversion cannot be combined in iText. Or at least that is what I think from my experience.
Is there any other way to create a PDF from Java with images, rich text rendered as entered, and headers and footers?
iText can convert HTML data to PDF. Below is a snippet that does it:
Let's assume the HTML data is available as an InputStream (if it is a String, it can be converted to an InputStream using Apache Commons IOUtils).
InputStream htmlData; // HTML data that needs to be converted to PDF
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter pdfWriter = PdfWriter.getInstance(document, outputStream);
document.open();
// convert the HTML with the built-in convenience method
XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, document, htmlData);
document.close();
// outputStream now has the required pdf data
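If pulling in Commons IO only for the String-to-InputStream step feels heavy, the JDK alone can do it. A minimal sketch, assuming the HTML is already in memory as a String; the class name HtmlStreams and the method toInputStream are made up for this example and are not part of iText:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class HtmlStreams {
    private HtmlStreams() {}

    /** Wraps an HTML string in an InputStream, always encoding it as UTF-8. */
    static InputStream toInputStream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        try (InputStream htmlData = toInputStream("<p><strong>XYZ</strong></p>")) {
            // htmlData could now be handed to the HTML-to-PDF call in place of a file stream
            System.out.println(htmlData.available() + " bytes ready");
        }
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which matters as soon as the rich text contains non-ASCII characters.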
I work as a Social Media Developer for Aspose. To add rich text to a form field in a PDF file, you can try our Aspose.Pdf for Java API. Check the following sample code:
// Open a PDF document
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("c:\\data\\input.pdf");
//Find Rich TextBox field using Field Name
RichTextBoxField textBoxField1 = (RichTextBoxField)pdfDocument.getForm().get("textbox1");
//Set the field value
textBoxField1.setValue("<strong> XYZ </strong>");
// Save the modified PDF
pdfDocument.save("c:\\data\\output2.pdf");
I am not trying to market or promote this product. This API actually solved our problem, so I thought I would mention it as it might help fellow developers. Please let me know if this is against your policy.
I finally realized that my requirement cannot be met with FOP, iText, Aspose, Flying Saucer, or JODConverter.
I found a paid API, Sferyx. It renders very complex HTML to PDF while almost entirely preserving the original style, and it also renders the images included in the HTML. We are still exploring this API and will post what other features it provides.
Related
In a Java project, how can an HTML form be converted to PDF on submission and then attached to an email?
Spring Boot and Thymeleaf are the frameworks in use. The form looks like this:
http://jsfiddle.net/x1hphsvb/5563/
Controller so far:
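import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.web.bind.annotation.RequestMapping;

@org.springframework.stereotype.Controller
@EnableAutoConfiguration
public class Controller {

    // Returns the view name for the form page
    @RequestMapping("/")
    String home() {
        return "static/index.html";
    }

    public static void main(String[] args) throws Exception {
        SpringApplication.run(Controller.class, args);
    }
}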
I have looked at this tutorial and searched for a way to do it with PDF Box without success.
Should I take the data in the back end and insert it into an HTML template, or insert the data into a PDF template?
The PDF form should also be collapsible, similar to the HTML form.
Regarding the conversion of HTML to PDF
The example you refer to has my name in it (a reference to a package name starting with com.lowagie), which means it's about iText, not about PDFBox. PDFBox doesn't convert HTML to PDF, so that's not an option.
Versions of iText with my name in them predate iText 5 and should no longer be used in a commercial context. See Can iText 2.1.7 / iTextSharp 4.1.6 or earlier be used commercially?
You also use the tag Flying Saucer. Flying Saucer is a third-party tool to convert HTML to PDF that was built on top of such an old version of iText.
Tips:
If you want to convert HTML to PDF, I suggest that you read Converting HTML to PDF using iText
If you want to use a templating format based on HTML, I suggest that you read How to create template and generate pdf using template and database data iText C#
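To make the "insert the data into an HTML template" option concrete, here is a library-free sketch of the idea. In a Spring Boot project Thymeleaf would normally do this job; the {{name}} placeholder syntax and the class HtmlTemplateFiller are hypothetical and exist only for this illustration. The filled HTML string is what you would then hand to an HTML-to-PDF converter.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlTemplateFiller {
    // Hypothetical placeholder syntax for this sketch: {{fieldName}}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    private HtmlTemplateFiller() {}

    /** Escapes the characters that are significant in HTML text and double-quoted attributes. */
    static String escapeHtml(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Replaces each {{name}} with the escaped value; unknown names become empty strings. */
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = values.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps '$' and '\' in user input from being read as group references
            matcher.appendReplacement(result, Matcher.quoteReplacement(escapeHtml(value)));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Escaping on insertion matters here because the submitted form values end up inside markup that a converter will parse; unescaped input could break the document structure.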
Regarding PDF forms
You wrote: "The PDF form should also have the collapsability similar to the HTML."
Please check ISO 32000-2 (the PDF 2.0 standard) and you'll discover that PDF forms can't collapse the same way HTML forms collapse. You may have seen PDF documents with similar functionality, but those forms weren't ISO 32000-2 documents; they were XFA forms. XFA stands for the XML Forms Architecture, and that technology was deprecated. You'll hardly find any viewers other than Adobe Reader that support such forms.
When it comes to data entry, PDF has lost and HTML 5 has won. If you've read the answer to the question How to create template and generate pdf using template and database data iText C#, you've noticed that the DITO product chose to create HTML 5 templates for data entry and PDF templates for data presentation.
I'm trying to flatten an XFA PDF using the iText pdfXFA library. When I flatten the PDF with the demo application provided by iText, all the data is correctly embedded in the result. When I do it with my own code, it is not: the data for the text fields and checkboxes is embedded correctly, but the attachment names are missing. By "attachments" I mean that the dynamic form can contain another PDF inside it, which can be added using buttons provided in the XFA PDF. Below is the code I'm using to flatten the PDF. I copied the XFA of the PDF into a separate file using iText RUPS and pass it as an InputStream to XFAFlattener's flattenXDP().
private void flattenXFA(String flattenedPDFDest) throws FileNotFoundException, IOException, InterruptedException {
    // The XFA for the PDF is copied from iText RUPS into the phshuman10.xfa.xml file.
    // try-with-resources closes both streams even if flattening fails.
    try (FileInputStream xdp = new FileInputStream("/home/NetBeansProjects/kitext/resources/phshuman10.xfa.xml");
         FileOutputStream fos = new FileOutputStream(flattenedPDFDest)) {
        XFAFlattener xfaf = new XFAFlattener();
        xfaf.flattenXDP(xdp, fos);
    }
}
Link to the zip of all required PDF's:
https://drive.google.com/file/d/0B6w278NcMSCrT2p6cWQxZG0yYVU/view?usp=sharing
The name of PDF in the zip:
Flattened PDF using itext demo: checkResult.pdf
Sample filled copy of form: PHSHumanSubjectsAndClinicalTrialsInfo-V1.0 (10).pdf
Flattened PDF using my code: tt_flattened3.pdf
The XFA file for PHSHumanSubjectsAndClinicalTrialsInfo-V1.0 (10).pdf: phshuman10.xfa.xml
If required, my scenario can be adequately reproduced using the uploaded resources! Thanks in advance.
This is expected behavior. If you open your original XFA form as a PDF file in a PDF viewer, you will see that there are 4 attachments in the PDF file. XFA itself is an XML-based format that can be embedded in a PDF, and it can interact with the PDF file through JavaScript APIs.
What happens is that the JavaScript code in your XFA form communicates with the PDF file (most likely through Acrobat's proprietary API) and is able to retrieve the attachments.
When you flatten a pure XDP package, you only extract from the PDF the XML that defines the XFA form, some datasets, and so on, but nothing that belongs to the PDF file itself: fonts, images, attachments.
If an XFA form uses PDF resources, flattening the XDP alone cannot reproduce them exactly as in the original form contained in the PDF.
Thus, if PDF resources are used in the XFA form, you have to flatten the PDF form directly via the flatten(InputStream, OutputStream) method, which accepts an input stream for a PDF containing an XFA form and an output stream for the resulting flattened PDF file.
Is it possible to put formatted HTML text (color, alignment, ...) from an HTMLEditor into an "editable" PDF using iText?
I didn't find anything on the internet.
Thanks.
The easiest way of doing this is (as Amedee suggested) using pdfHTML.
It's an iText 7 add-on that converts HTML5 (+CSS3) into PDF syntax.
The code is pretty straightforward:
HtmlConverter.convertToPdf(
    "<b>This text should be written in bold.</b>", // html to be converted
    new PdfWriter(
        new File("C://users/user2002/output.pdf") // destination file
    )
);
To learn more, go to https://itextpdf.com/itext7/pdfHTML
I found a solution using Flying Saucer in this post.
I have a web page with an export-to-PDF option that has to show the contents of the page in the PDF. Currently I use the iText PDF library to generate the PDFs. The problem is that creating a PDF with iText is quite a challenge. Moreover, we get frequent layout/UI changes for the web page, so we have to make the same changes to the PDF.
Is there any way I can convert my JSP output to PDF? For example, if we set contentType="application/vnd.ms-excel", a JSP table can be rendered as an Excel document.
Have you checked JasperReports? It is built around XML templates, and the same template can be used to generate Word, XLS, PDF, CSV, or XML output.
You don't need to change the iText code generation if you use it in combination with Flying Saucer (a.k.a. XhtmlRenderer). It's then basically as simple as:
String inputPath = new File("/file.xhtml").toURI().toURL().toString();
OutputStream outputStream = new FileOutputStream("/file.pdf");
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(inputPath);
renderer.layout();
renderer.createPDF(outputStream);
outputStream.close();
You can find a blog with more code samples here.
You should check wkhtmltopdf.
Does anyone know if it is possible to convert an HTML page (URL) to a PDF using iText?
If the answer is 'no' then that is OK as well, since I will stop wasting my time trying to work it out and just spend some money on one of a number of components which I know can :)
I think this is exactly what you were looking for
http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html
http://code.google.com/p/flying-saucer
Flying Saucer's primary purpose is to render spec-compliant XHTML and CSS 2.1 to the screen as a Swing component. Though it was originally intended for embedding markup into desktop applications (things like the iTunes Music Store), Flying Saucer has been extended to work with iText as well. This makes it very easy to render XHTML to PDFs, as well as to images and to the screen. Flying Saucer requires Java 1.4 or higher.
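Since Flying Saucer works on XHTML rather than arbitrary tag soup, it can help to check that the markup is at least well-formed XML before handing it to the renderer. A minimal sketch using only the JDK's XML parser; the class XhtmlCheck and the method isWellFormed are names invented for this example, and the check says nothing about whether the document is valid XHTML, only that it parses:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XhtmlCheck {
    private XhtmlCheck() {}

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

Typical browser-style HTML such as an unclosed <br> fails this check, which is usually the first thing to fix when a page renders in the browser but not in the PDF. Note that input carrying a DOCTYPE may make the parser try to load the referenced DTD, so this sketch is best used on fragments or documents without one.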
I have ended up using ABCPdf from webSupergoo.
It works really well and for about $350 it has saved me hours and hours based on your comments above.
The easiest way of doing this is using pdfHTML.
It's an iText 7 add-on that converts HTML5 (+CSS3) into PDF syntax.
The code is pretty straightforward:
HtmlConverter.convertToPdf(
    "<b>This text should be written in bold.</b>", // html to be converted
    new PdfWriter(
        new File("C://users/mark/documents/output.pdf") // destination file
    )
);
To learn more, go to http://itextpdf.com/itext7/pdfHTML
The answer to your question is actually two-fold. First of all you need to specify what you intend to do with the rendered HTML: save it to a new PDF file, or use it within another rendering context (i.e. add it to some other document you are generating).
The former is relatively easily accomplished using the Flying Saucer framework, which can be found here: https://github.com/flyingsaucerproject/flyingsaucer
The latter is actually a much more comprehensive problem that needs to be categorized further.
Using iText you won't be able to (trivially, at least) combine iText elements (i.e. Paragraph, Phrase, Chunk and so on) with the generated HTML. You can hack your way around this by rendering the HTML into a template and placing it with PdfContentByte's addTemplate method.
If you on the other hand want to stamp the generated HTML with something like watermarks, dates or the like, you can do this using iText.
So bottom line: You can't trivially integrate the rendered HTML in other pdf generating contexts, but you can render HTML directly to a blank PDF document.
Use the iText library.
Here is sample code that works fine for me:
String htmlFilePath = filePath + ".html";
String pdfFilePath = filePath + ".pdf";
// create an html file on given file path
Writer unicodeFileWriter = new OutputStreamWriter(new FileOutputStream(htmlFilePath), "UTF-8");
unicodeFileWriter.write(document.toString());
unicodeFileWriter.close();
ConverterProperties properties = new ConverterProperties();
properties.setCharset("UTF-8");
if (url.contains(".kr") || url.contains(".tw") || url.contains(".cn") || url.contains(".jp")) {
    properties.setFontProvider(new DefaultFontProvider(false, false, true));
}
// convert the html file to pdf file.
HtmlConverter.convertToPdf(new File(htmlFilePath), new File(pdfFilePath), properties);
Maven dependencies
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itext7-core</artifactId>
    <version>7.1.6</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>html2pdf</artifactId>
    <version>2.1.3</version>
</dependency>
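One caveat about the url.contains(...) test in the conversion code above: it matches the suffix anywhere in the URL, so a path such as /download.jpg also contains ".jp". A possible variant, offered as a sketch rather than as part of the original answer, looks only at the host name. The class CjkHostCheck and the method needsCjkFonts are hypothetical names; the four suffixes are the ones from the snippet.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CjkHostCheck {
    // Same four suffixes as in the conversion snippet
    private static final List<String> CJK_TLDS = Arrays.asList(".kr", ".tw", ".cn", ".jp");

    private CjkHostCheck() {}

    /** Returns true if the URL's host ends with one of the listed country suffixes. */
    static boolean needsCjkFonts(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false; // relative or opaque URI: no host to inspect
            }
            String lower = host.toLowerCase(Locale.ROOT);
            for (String tld : CJK_TLDS) {
                if (lower.endsWith(tld)) {
                    return true;
                }
            }
            return false;
        } catch (URISyntaxException e) {
            return false; // unparseable input: fall back to the default fonts
        }
    }
}
```

The result could replace the condition of the if statement that sets the font provider. Whether a country suffix is a good proxy for "this page needs CJK fonts" is a separate question this sketch does not try to answer.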
Use iText's HTMLWorker
Example
When I needed HTML to PDF conversion earlier this year, I tried the trial version of the Winnovative HTML to PDF converter (I think ExpertPDF is the same product, too). It worked great, so we bought a license from that company. I didn't go into it in much depth after that.