How can I add a text field to an existing PDF template? - Java

I want to create a PDF template from another template. The resulting PDF should still be a template that I can then fill with data.
I tried to use PdfStamper, but the resulting PDF is not a template. Can anyone help? Thanks.

Let's distinguish two situations, depending on the nature of your PDF template:
You are talking about an XFA template:
In this case, the PDF is merely a container for an XML stream that defines your form. The only way to change it is by editing the XML. This is best done manually using Adobe LiveCycle Designer, but if you really want to do it programmatically, you can extract the XML from the PDF using iText, manipulate the XML using any type of XML editing software, and finally put the XML back into the PDF using iText. The programmatic solution is very difficult, as it requires you to be familiar with the XFA syntax, and the XFA specs run to several hundred pages.
You are talking about an AcroForm template:
In this case, the root dictionary has an /AcroForm dictionary of which one of the entries is a /Fields array that isn't empty. You can create a PdfReader instance for this template and pass the reader object to PdfStamper. You then create the extra fields you need (text fields, button fields,...) and add them to the stamper using the addAnnotation() method.
This is shown in the SubscribeForm example. We have an existing template subscribe.pdf and we add several buttons to it, resulting in the new template subscribe_me.pdf.
If this doesn't answer your question, please clarify. It's generally not accepted to limit your question to saying "I try to use PdfStamper but the result pdf is not template". You should at least show what you've tried; otherwise you risk having your question closed.

Related

Is there any way to convert an HTML file to an in-memory PDF in Java?

I have been given an HTML file and want to convert it into an in-memory PDF file. During the conversion, I don't want to use any external location. All I want is to keep it in memory.
So far, I have tried some Java libraries for the conversion, but all of them create a temporary file in some location and then read from and write to it. I don't want any file I/O during the conversion.
The HTMLWorker class was deprecated many years ago. The goal of HTMLWorker was to convert small, simple HTML snippets to iText objects. It was never meant to convert complete HTML pages to PDF, yet that was how many developers tried to use it. This caused plenty of frustration because HTMLWorker didn't support every HTML tag, didn't parse CSS files, and so on. To avoid this frustration, HTMLWorker was removed from recent versions of iText.
In 2011, iText Group released XML Worker as a generic XML to PDF tool, built on top of iText 5. A default implementation converted XHTML (data) and CSS (styles) to PDF, mapping HTML tags such as <p>, <img>, and <li> to iText 5 objects such as Paragraph, Image, and ListItem. We don't know of any implementations that used XML Worker for any other XML formats, but many developers used XML Worker in combination with jsoup as an HTML2PDF converter.
XML Worker wasn't a URL2PDF tool though. XML Worker expected predictable HTML created for the sole purpose of converting that HTML to PDF. A common use case was the creation of invoices. Rather than programming the design of an invoice in Java or C#, developers chose to create a simple HTML template defining the structure of the document, and some CSS defining the styles. They then populated the HTML with data, and used XML Worker to create the invoices as PDF documents, throwing away the original HTML. We'll take a closer look at this use case in chapter 4, converting XML to HTML in memory using XSLT, then converting that HTML to PDF using the pdfHTML add-on.
When iText 5 was originally created, it was designed as a tool to produce PDF as fast as possible, flushing pages to the OutputStream as soon as they were finished. Several design choices that made perfect sense when iText was first released in the year 2000 were still present in iText 5 sixteen years later. Unfortunately, some of these choices made it very difficult, if not impossible, to extend the functionality of XML Worker to the level of quality many developers expected. If we really wanted to create a great HTML to PDF converter, we would have to rewrite iText from scratch. Which we did.
In 2016, we released iText 7, a brand new version of iText that was no longer compatible with previous versions, but that was created with pdfHTML in mind. A lot of work was spent on the new Renderer framework. When a document is created with iText 7, a tree of renderers and their child-renderers is built. The layout is created by traversing that tree, an approach that is much better suited when dealing with HTML to PDF conversion. The iText objects were completely redesigned to better match HTML tags and to allow setting styles "the CSS way."
For instance: in iText 5, you had a PdfPTable and a PdfPCell object to create a table and its cells. If you wanted every cell to contain text in a font different from the default font, you needed to set that font for the content of every separate cell. In iText 7, you have a Table and Cell object, and when you set a different font for the complete table, this font is inherited as the default font for every cell. That was a major step forward in terms of architectural design, especially if the goal is to convert HTML to PDF.
But let's not dwell on the past, let's see what pdfHTML can do for us. In the first chapter, we'll take a look at different variations of the convertToPdf()/ConvertToPdf() method, and we'll discover how the converter is configured.
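This is the solution for converting HTML to PDF in memory that works for me (it uses Flying Saucer's ITextRenderer and JavaMail):
import java.io.ByteArrayOutputStream;
import javax.activation.DataHandler;
import javax.mail.internet.MimeBodyPart;
import javax.mail.util.ByteArrayDataSource;
import org.xhtmlrenderer.pdf.ITextRenderer;
// Render the HTML string into an in-memory buffer; no file is written.
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ITextRenderer renderer = new ITextRenderer();
renderer.setDocumentFromString(html);
renderer.layout();
renderer.createPDF(outputStream);
outputStream.close();
// Attach the PDF bytes to an e-mail body part.
MimeBodyPart att = new MimeBodyPart();
ByteArrayDataSource bds = new ByteArrayDataSource(outputStream.toByteArray(), "application/pdf");
att.setDataHandler(new DataHandler(bds));
att.setFileName("example.pdf");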

Make existing PDFs into templates - iText

I am trying to make some existing PDFs into templates.
Because these documents hold real data, I am replacing that data, such as names and addresses, with dummy placeholders.
Examples
[[Name]]
[[Address1]]
When I replace the text programmatically with the iText version 5 library, I can use the template.
To speed things up, I tried to make the replacements in Adobe DC instead.
With that method, the template stops working.
Any ideas?
From what I understand of your question, you want to:
1) have a template document
2) fill in the template with data from a program
3) turn this back into a PDF
You can easily achieve some of your goals with iText.
I suggest you look into http://developers.itextpdf.com/examples/form-examples/clone-filling-out-forms

Re-write existing pdf via iText

Is there a way to blank out a given PDF file and write new data to it? I know that it is possible to trim a document by deleting pages from the middle, but I didn't find any way to clear a document completely. Thank you.
I agree with @Samuel Huylebroeck: if you are looking to create new content, just create new pages or a new document.
If you really want to, though, you should be able to remove the existing content of a page in a PDF by going through some of the lower-level APIs that deal with content streams. (Content streams are not specific to iText, so if you want to learn more about PDF in general, you can read about them anywhere.)
I don't know whether iText will allow you to set a page's content stream, or the content stream's data, to null. It would be quick to try, though, if you are really committed to this approach for whatever you are trying to achieve.

Replacing placeholders using iText in Java

I have a PDF that contains placeholders like <%DATE_OF_BIRTH%>. I want to be able to read in the PDF and replace the placeholders with text using iText.
So read in PDF, use maybe a replaceString() method and change the placeholders then generate the new PDF.
Is this possible?
Thanks.
The use of placeholders in PDF is very, very limited. Theoretically it can be done, and there are some instances where it would be feasible to do what you describe, but because PDF carries very little structural information, it's hard:
* Simply extracting words is difficult, so recognising your placeholders in the PDF would already be difficult in many cases.
* Replacing text in PDF is a nightmare because PDF files generally don't have a concept of words, lines, and paragraphs. Hence there is no nice reflow of text, for example.
Like I said, it could theoretically work under special conditions, but it's not a very good solution.
What would be a better approach depends on your use case:
1) For some forms it may be acceptable to have the complete form as a background image or PDF file and then generate your text as an overlay on that background (filling in the blanks, so to speak). As pointed out by Bruno and mlk in comments, in this case you can also look into using form fields, which can be filled dynamically.
2) For other forms it may be better to have your template in a structured format such as XML or HTML, do the text replacement in that format and then convert it into PDF.
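Option 2 can be sketched as follows. This is only a minimal illustration of the text-replacement step, using the JDK alone and the <%NAME%> placeholder syntax from the question. The class name TemplateFiller and its methods are made up for this example, and the conversion of the filled HTML to PDF is left to whichever converter you choose.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of iText): fills <%NAME%> placeholders in an HTML template.
class TemplateFiller {
    // Matches placeholders such as <%DATE_OF_BIRTH%>.
    private static final Pattern PLACEHOLDER = Pattern.compile("<%([A-Z0-9_]+)%>");

    /** Replaces every placeholder that has a value; unknown placeholders are left untouched. */
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Escape the value so that data cannot break the HTML structure.
            String replacement = (value == null) ? m.group() : escapeHtml(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Escapes the characters that are significant in HTML text and attribute values. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

Because the replacement happens before the PDF is generated, the converter lays out the final text itself, so the reflow problem described above does not arise.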

convert pdf to xml

I want to convert a PDF file containing a few images into XML using Java.
Is there any API through which this can be done, so that all the images and text of the PDF are converted into an XML file?
Please help.
Use pdftohtml.
It can be installed with brew install pdftohtml. This adds pdftohtml to your path.
So, to convert pdf to xml, you can run pdftohtml -xml your_file.pdf your_file.xml
Then, just use Java or any other language to execute this command.
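Running that command from Java could look roughly like this. It is a sketch that assumes pdftohtml is installed and on the PATH; the class name PdfToXml and its methods are made up for this example, and the arguments simply mirror the command line shown above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the external pdftohtml tool.
class PdfToXml {
    /** Builds the command from the answer: pdftohtml -xml <input> <output>. */
    static List<String> buildCommand(Path pdf, Path xml) {
        return Arrays.asList("pdftohtml", "-xml", pdf.toString(), xml.toString());
    }

    /** Runs the tool and returns its exit code; assumes pdftohtml is on the PATH. */
    static int convert(Path pdf, Path xml) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf, xml))
                .inheritIO() // forward the tool's stdout/stderr to this process's console
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids quoting problems when file names contain spaces. Check the exit code returned by convert() before reading the XML file.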
PDF is one of the worst formats to work with. It is designed for rendering 2D graphics and text documents. There are libraries that allow you to manipulate the objects in a PDF document, but they will not be able to tell you which paragraph an image is related to. You will not be able to extract the semantics easily.
On the other hand, XML is designed to store text data in a well-structured manner, which means it carries implicit semantics. To convert from a format that has no semantics to one that does, you will need to add your own logic to the conversion process. Otherwise you will just end up with a mess in your XML, which defeats the whole purpose of using XML.
Since PDF documents differ so much from one another, it is almost impossible to automate this without human help.
If you are really determined to do it, I suggest you use a library to read the PDF into objects and start writing a converter from there. You will have to take care of page breaks, line breaks, page numbers, headers, images, graphics, tables, and much more by yourself. Since XML is made mainly for text data, you will have to deal with graphics somehow if you want to store them in XML, e.g. by converting them into Base64 strings.
iText is a library that allows you to create and manipulate PDF documents. It lets developers enhance web and other applications with dynamic PDF document generation and/or manipulation.
Developers can use iText to:
* Serve PDF to a browser
* Generate dynamic documents from XML files or databases
* Use PDF's many interactive features
* Add bookmarks, page numbers, watermarks, etc.
* Split, concatenate, and manipulate PDF pages
* Automate filling out of PDF forms
* Add digital signatures to a PDF file
iText is available in Java as well as in C#.
You could Base64 encode the entire PDF file's byte stream and serialize it into an XML document like "<pdf><![CDATA[BASE64ENCODEDPDFFILECONTENTS...]]></pdf>". =)
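Taken literally, that suggestion needs nothing beyond the JDK. The sketch below uses a made-up class name, PdfXmlEnvelope, and the <pdf> element from the line above. It only wraps the bytes and makes none of the PDF's content accessible as XML.

```java
import java.util.Base64;

// Hypothetical helper: wraps raw PDF bytes in a one-element XML document and back.
class PdfXmlEnvelope {
    private static final String OPEN = "<pdf><![CDATA[";
    private static final String CLOSE = "]]></pdf>";

    /** Base64-encodes the bytes; the Base64 alphabet cannot contain "]]>", so CDATA is safe. */
    static String wrap(byte[] pdfBytes) {
        return OPEN + Base64.getEncoder().encodeToString(pdfBytes) + CLOSE;
    }

    /** Reverses wrap(); expects exactly the format that wrap() produces. */
    static byte[] unwrap(String xml) {
        int start = xml.indexOf(OPEN) + OPEN.length();
        int end = xml.indexOf(CLOSE, start);
        return Base64.getDecoder().decode(xml.substring(start, end));
    }
}
```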
