PDF issue with fonts in adobe, after using overlays in PDFBox

PDF issue with fonts in adobe, after using overlays in PDFBox - java

We use pdfbox for in one of our applications.
Some pdfs that are overlaid result in "broken" output and fonts.
Below is the sample code I'm using to overlay pdfs.
The pdfs sometimes have different numbers of pages.
We flatten acroforms and set annotations to read-only.
Pdf page rotation and bbox sizing sometimes set differently (especially from scanners) so we try to correct for this.
PDDocument baseDocument = PDDocument.load(new File("base.pdf"));
PDDocument overlayDocument = PDDocument.load(new File("overlay.pdf"));
Iterator<PDPage> baseDocumentIterator = baseDocument.getPages().iterator();
Iterator<PDPage> overlayIterator = overlayDocument.getPages().iterator();
PDDocument finalOverlayDoc = new PDDocument();
while(baseDocumentIterator.hasNext() && overlayIterator.hasNext()) {
PDPage backing = baseDocumentIterator.next();
//locking annotations per page
List<PDAnnotation> annotations = backing.getAnnotations();
for (PDAnnotation a :annotations) {
a.setLocked(true);
a.setReadOnly(true);
}
// setting size so there's no weird overflow issues
PDRectangle rect = new PDRectangle();
rect.setLowerLeftX(0);
rect.setLowerLeftY(0);
rect.setUpperRightX(backing.getBBox().getWidth());
rect.setUpperRightY(backing.getBBox().getHeight());
backing.setCropBox(rect);
backing.setMediaBox(rect);
backing.setBleedBox(rect);
PDPage pg = overlayIterator.next();
//setting rotation if different. Some scanners cause issues.
if(backing.getRotation()!= pg.getRotation())
{
pg.setRotation(-backing.getRotation());
}
finalOverlayDoc.addPage(pg);
}
finalOverlayDoc.close();
//flatten acroform
PDAcroForm acroForm = baseDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null) {
acroForm.flatten();
acroForm.setNeedAppearances(false);
}
Overlay overlay = new Overlay();
overlay.setOverlayPosition(Overlay.Position.FOREGROUND);
overlay.setInputPDF(baseDocument);
overlay.setAllPagesOverlayPDF(finalOverlayDoc);
Map<Integer, String> ovmap = new HashMap<Integer, String>();
overlay.overlay(ovmap);
PDPageTree allOverlayPages = overlayDocument.getPages();
if(baseDocument.getPages().getCount() < overlayDocument.getPages().getCount()) //Additional pages in the overlay pdf need to be appended to the base pdf.
{
for(int i=baseDocument.getPages().getCount();i<allOverlayPages.getCount(); i++)
{
baseDocument.addPage(allOverlayPages.get(i));
}
}
PDDocument finalDocument = new PDDocument();
for(PDPage p: baseDocument.getPages()){
finalDocument.addPage(p);
}
String filename = "examples/merge_pdf_examples/debug.pdf";
filename = filename + new Date().getTime() + ".pdf";
finalDocument.save(filename);
finalDocument.close();
baseDocument.close();
overlayDocument.close();

There is no error in the PDF file you shared relevant for using Overlay.
It uses one PDF feature which is seldom used, though, the pages inherit resources from their parent node: Page objects in a PDF are arranged in a tree with the actual pages being leaves; a page object in this tree often itself carries all the information defining it but a number of page properties can also be carried by an inner node and inherited by descendant pages unless they override them.
After you shared your code it turns out that you have a preparation step which loses all inherited information: When you generate finalOverlayDoc from overlayDocument you essentially do:
while(overlayIterator.hasNext()) {
PDPage pg = overlayIterator.next();
//setting rotation if different. Some scanners cause issues.
finalOverlayDoc.addPage(pg);
}
(OverlayDocuments test testOverlayPreparationExampleBroken)
Here you only transport the page object itself, losing all inherited properties.
For the document at hand you can fix this by explicitly setting the page resources to the inherited ones:
while(overlayIterator.hasNext()) {
PDPage pg = overlayIterator.next();
pg.setResources(pg.getResources());
//setting rotation if different. Some scanners cause issues.
finalOverlayDoc.addPage(pg);
}
(OverlayDocuments test testOverlayPreparationFixedExampleBroken)
Beware, though: This only explicitly sets the page resources but there also are other page attributes which can be inherited.
I would propose, therefore, that you don't create a new PDDocument at all; instead of moving the overlayDocument pages to finalOverlayDoc only change them in place. If overlayDocument has more pages than baseDocument, you additionally have to remove excess pages from overlayDocument. Then use overlayDocument in overlaying instead of finalOverlayDoc.
Looking further down your code I see you repeat the anti-pattern of moving page objects to other documents without respecting inherited properties again and again. I guess you should completely overhaul that code, removing that anti-pattern.

Related

Replace itext to pdfbox performance

I am evaluating to replace our pdf processing from itext to pdfbox. I did some tests with 200 pdfs with a single page (94KB, 469KB, 937KB) and merged them to one pdf in our application. PDFBox version: 2.0.23.
itext version: 2.1.7. Here are the test results:
Here is the itext implementation:
byte[] l_PDFPage = null;
PdfReader l_PDFReader = null;
PdfCopy l_Copier = null;
Document l_PDFDocument = null;
OutputStream l_Stream = new FileOutputStream(m_File);
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
l_Page = m_Editor.getPageAt(i);
l_PDFPage = l_Page.getAsPdf();
l_PDFReader = new PdfReader(l_PDFPage);
l_PDFReader.getPageN(1).put(PdfName.ROTATE, new PdfNumber(l_PDFReader.getPageRotation(1) + l_Page.getRotation() % 360));
l_PDFReader.consolidateNamedDestinations();
if( i == 0 ) {
l_PDFDocument = new Document(l_PDFReader.getPageSizeWithRotation(1));
l_Copier = new PdfCopy(l_PDFDocument, l_Stream);
l_PDFDocument.open();
}
l_Copier.addPage(l_Copier.getImportedPage(l_PDFReader, 1));
if( l_PDFReader.getAcroForm() != null )
l_Copier.copyAcroForm(l_PDFReader);
l_Copier.flush();
l_Copier.freeReader(l_PDFReader);
}
l_PDFDocument.close();
l_Stream.close();
Here is the pdfbox implementation:
byte[] l_PDFPage = null;
List<PDDocument> pageDocuments = new ArrayList<>();
PDDocument saveDocument = new PDDocument();
try {
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
// our wrapper object for a page
l_Page = m_Editor.getPageAt(i);
// page as byte[]
l_PDFPage = l_Page.getAsPdf();
PDDocument document = PDDocument.load(l_PDFPage);
// save page document to close it later
pageDocuments.add(document);
PDPage page = document.getPage(0);
saveDocument.addPage(saveDocument.importPage(page));
}
saveDocument.save(l_Stream);
}
finally {
// close every page document
for(PDDocument doc : pageDocuments) {
doc.close();
}
saveDocument.close();
}
I have also tried using pdfmerger of pdfbox. The performance was nearly the same as the other pdfbox implementation. But with the 937KB files I run in an outofmemory exception with this implementation:
byte[] l_PDFPage = null;
OutputStream l_Stream = new FileOutputStream(m_File);
PDFMergerUtility merger = new PDFMergerUtility();
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
l_Page = m_Editor.getPageAt(i);
// page as byte[]
l_PDFPage = l_Page.getAsPdf();
merger.addSource(new ByteArrayInputStream(l_PDFPage));
}
merger.setDestinationStream(l_Stream);
merger.mergeDocuments(null);
So my questions:
Why is the performance (needed time AND memory usage) of pdfbox so bad in comparison to itext?
Am I missing something in our pdfbox implementation?
Why I can't close the "page document" after I added the page in "saveDocument"? If i close it there I'd get an error while saving so I have to store the "page documents" and close them at the end.

PDFBox and iText are architecturally different and, therefore, perform differently well for different tasks.
In particular iText attempts to write out new contents early, in your case much of the page is written to the output already during
l_Copier.addPage(l_Copier.getImportedPage(l_PDFReader, 1));
and
l_PDFDocument.close();
eventually only finalizes the PDF and writes last remaining objects and the file trailer.
PDFBox on the other hand saves everything in the end at once:
saveDocument.save(l_Stream);
The approach of iText has the advantage of a smaller memory footprint (as you observed) and the disadvantage that you cannot change data of a page once it is written.
(As an aside: the iText architecture has changed from iText 5 to iText 7, in iText 7 you have the choice and can keep everything in memory, but the price here also is a big memory footprint.)
Thus,
Why is the performance (needed time AND memory usage) of pdfbox so bad in comparison to itext?
The difference in memory use can partially be explained by the above. Also in iText after
l_Copier.freeReader(l_PDFReader);
the PdfReader can be closed (which you leave to the garbage collection to do for you) to free its resources while in your PDFBox code you keep all the source documents open, holding the resources up to the end. (Actually I would have assumed that when you're using importPage, you needn't keep them.)
Concerning the time I'm not sure now. You should do some finer clocking and determine where exactly the extra time is used in PDFBox; thus, I second #Tilman's request for profiling data. I assume it's during the final save but that's only a hunch. Also such time differences might depend on structural details of the PDF in question and may be less extreme for other documents.

Problem drawing some 16bit transparent images with PDPageContentStream.drawImage after updating PDFBox from Version 2.08 to 2.12/2.16

I've some very basic code which inserts an image into an existing PDF:
public class InsertImg
{
public static void main (final String[] args) throws IOException
{
PDDocument document = PDDocument.load (new File ("original.pdf"));
PDPage page = document.getPage (0);
byte[] imgBytes = Files.readAllBytes (Paths.get ("signature.png"));
PDImageXObject pdImage = PDImageXObject.createFromByteArray (document, imgBytes, "name_of_image");
PDPageContentStream content = new PDPageContentStream (document, page, AppendMode.APPEND, true, true);
content.drawImage (pdImage, 50.0f, 350.0f, 100.0f, 25.0f);
content.close ();
document.save (new File ("result.pdf"));
document.close ();
}
}
While this code worked fine in PdfBox 2.08 for all image files, it works under version 2.012 only for some images and does not work anymore for all image files.
(Background: We would like to insert an image of a signature into an existing and already generated letter. The signatures are all generated with the same software. In version 2.12 not all signatures can be inserted anymore. In version 2.08 all signature could be inserted).
The generated pdf-file "result.pdf" cannot be opened in Acrobat Reader. Acrobat Reader shows only the original pdf "original.pdf", but does not display the signature-image. It says "error in page. please contact the creator of the pdf".
However, most images can be inserted, so it is likely that the problem depends on the very image used.
The images are all ok, they are png's and where checked and verified with various imaging programs, e.g. gimp or irfanview.
Furthermore, the code above has always worked fine with PdfBox 2.08. After an update of PdfBox to version 2.12, the problem showed up and also the newest version 2.16 still produces the error. Still on the same image files, and still not on all.
NB: When I put the following line into comment, then no error shows up in Acrobat Reader, so the problem must be somewhere within drawImage.
// content.drawImage (pdImage, 50.0f, 350.0f, 100.0f, 25.0f);
and the rest of the code seems to be fine.
Also, I've just tried starting with an empty PDF and not loading an already generated one.
PDDocument document = new PDDocument ();
PDPage page = new PDPage ();
document.addPage (page);
[...]
The problem here is still the same, so the issue does not depend on the underlying PDF.

It is a bug since 2.0.12 (wrong alternate colorspace for gray images created with the LosslessFactory) that has been fixed in PDFBOX-4607 and will be in release 2.0.17. Display works for all viewers I have tested except Adobe Reader, despite that the alternate colorspace shouldn't be used when an ICC colorspace is available. Here's some code to fix PDFs (this assumes that images are only on top level of a page, i.e. images in other structures are not considered)
for (PDPage page : doc.getPages())
{
PDResources resources = page.getResources();
if (resources == null)
{
continue;
}
for (COSName name : resources.getXObjectNames())
{
PDXObject xObject = resources.getXObject(name);
if (xObject instanceof PDImageXObject)
{
PDImageXObject img = (PDImageXObject) xObject;
if (img.getColorSpace() instanceof PDICCBased)
{
PDICCBased icc = (PDICCBased) img.getColorSpace();
if (icc.getNumberOfComponents() == 1 && PDDeviceRGB.INSTANCE.equals(icc.getAlternateColorSpace()))
{
List<PDColorSpace> list = new ArrayList<>();
list.add(PDDeviceGray.INSTANCE);
icc.setAlternateColorSpaces(list);
}
}
}
}
}

PDFBox Form fill - saveIncremental does not work

I have a pdf file with some form field that I want to fill from java. Right now I'm trying to fill just one form which I am finding by its name. My code looks like this:
File file = new File("c:/Testy/luxmed/Skierowanie3.pdf");
PDDocument document = PDDocument.load(file);
PDDocumentCatalog doc = document.getDocumentCatalog();
PDAcroForm Form = doc.getAcroForm();
String formName = "topmostSubform[0].Page1[0].pana_pania[0]";
PDField f = Form.getField(formName);
setField(document, formName, "Artur");
System.out.println("New value 2nd: " + f.getValueAsString());
document.saveIncremental(new FileOutputStream("c:/Testy/luxmed/nowy_pd3.pdf"));
document.close();
and this:
public static void setField(PDDocument pdfDocument, String name, String Value) throws IOException
{
PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
PDField field = acroForm.getField(name);
if (field instanceof PDCheckBox){
field.setValue("Yes");
}
else if (field instanceof PDTextField){
System.out.println("Original value: " + field.getValueAsString());
field.setValue(Value);
System.out.println("New value: " + field.getValueAsString());
}
else{
System.out.println("Nie znaleziono pola");
}
}
As system.out states, the value was set correctly, but in new the generated pdf file, new value is not showing up (original String is presented) so I guess the incremental saving does not work properly. What am I missing?
I use 2.0.2 version of pdfbox, and here is pdf file with which I working: pdf

In general
When saving changes to a PDF as an incremental update with PDFBox 2.0.x, you have to set the property NeedToBeUpdated to true for every PDF object changed. Furthermore, the object must be reachable from the PDF catalog via a chain of references, and each PDF object in this chain also has to have the property NeedToBeUpdated set to true.
This is due to the way PDFBox saves incrementally, starting from the catalog it inspects the NeedToBeUpdated property, and if it is set to true, PDFBox stores the object, and only in this case it recurses deeper into the objects referenced from this object in search for more objects to store.
In particular this implies that some objects unnecessarily have to be marked NeedToBeUpdated, e.g. the PDF catalog itself, and in some cases this even defeats the purpose of the incremental update at large, see below.
In case of the OP's document
Setting the NeedToBeUpdated properties
On one hand one has to extend the setField method to mark the chain of field dictionaries up to and including the changed field and also the appearance:
public static void setField(PDDocument pdfDocument, String name, String Value) throws IOException
{
PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
PDField field = acroForm.getField(name);
if (field instanceof PDCheckBox) {
field.setValue("Yes");
}
else if (field instanceof PDTextField) {
System.out.println("Original value: " + field.getValueAsString());
field.setValue(Value);
System.out.println("New value: " + field.getValueAsString());
}
else {
System.out.println("Nie znaleziono pola");
}
// vvv--- new
COSDictionary fieldDictionary = field.getCOSObject();
COSDictionary dictionary = (COSDictionary) fieldDictionary.getDictionaryObject(COSName.AP);
dictionary.setNeedToBeUpdated(true);
COSStream stream = (COSStream) dictionary.getDictionaryObject(COSName.N);
stream.setNeedToBeUpdated(true);
while (fieldDictionary != null)
{
fieldDictionary.setNeedToBeUpdated(true);
fieldDictionary = (COSDictionary) fieldDictionary.getDictionaryObject(COSName.PARENT);
}
// ^^^--- new
}
(FillInFormSaveIncremental method setField)
On the other hand the main code has to be extended to mark a chain from the catalog to the fields array:
PDDocument document = PDDocument.load(...);
PDDocumentCatalog doc = document.getDocumentCatalog();
PDAcroForm Form = doc.getAcroForm();
String formName = "topmostSubform[0].Page1[0].pana_pania[0]";
PDField f = Form.getField(formName);
setField(document, formName, "Artur");
System.out.println("New value 2nd: " + f.getValueAsString());
// vvv--- new
COSDictionary dictionary = document.getDocumentCatalog().getCOSObject();
dictionary.setNeedToBeUpdated(true);
dictionary = (COSDictionary) dictionary.getDictionaryObject(COSName.ACRO_FORM);
dictionary.setNeedToBeUpdated(true);
COSArray array = (COSArray) dictionary.getDictionaryObject(COSName.FIELDS);
array.setNeedToBeUpdated(true);
// ^^^--- new
document.saveIncremental(new FileOutputStream(...));
document.close();
(FillInFormSaveIncremental test testFillInSkierowanie3)
Beware: for use with generic PDFs one obviously should introduce some null tests...
Opening the result file in Adobe Reader one will unfortunately see that the program complains about changes which disable extended features in the file.
This is due to the quirk in PDFBox' incremental saving that it requires some unnecessary objects in the update section. In particular the catalog is saved there which contains a usage rights signature (the technology granting extended features). The re-saved signature obviously is not at its original position in its original revision anymore. Thus, is invalidated.
Most likely the OP OP wanted to save the PDF incrementally to not break this signature but PDFBox does not permit this. Oh well...
Thus, the only thing one can do is prevent the warning by completely removing the signature.
Removing the usage rights signature
We already have retrieved the catalog object in the additions above, so removing the signature is easy:
COSDictionary dictionary = document.getDocumentCatalog().getCOSObject();
// vvv--- new
dictionary.removeItem(COSName.PERMS);
// ^^^--- new
dictionary.setNeedToBeUpdated(true);
(FillInFormSaveIncremental test testFillInSkierowanie3)
Opening the result file in Adobe Reader one will unfortunately see that the program complains about missing extended features in the file to save it.
This is due to the fact that Adobe Reader requires extended features to save changes to XFA forms, extended features we had to remove in this step.
But the document at hand is a hybrid AcroForm & XFA form document, and Adobe Reader requires no extended features to save AcroForm documents. Thus, all we have to do is remove the XFA form. As our code only sets the AcroForm value, this is a good idea anyways...
Removing the XFA form
We already have retrieved the acroform object in the additions above, so removing the XFA form referenced from there is easy:
dictionary = (COSDictionary) dictionary.getDictionaryObject(COSName.ACRO_FORM);
// vvv--- new
dictionary.removeItem(COSName.XFA);
// ^^^--- new
dictionary.setNeedToBeUpdated(true);
(FillInFormSaveIncremental test testFillInSkierowanie3)
Opening the result file in Adobe Reader one will see that one now can without further ado edit the form and save the file.
Beware, a sufficiently new Adobe Reader version is required for this, earlier versions (up to at least version 9) did require extended features even for saving changes to an AcroForm form

Page is cropped in new document in PDFBox while copying

I am trying to split the single PDF into multiple. Like 10 page document into 10 single page document.
PDDocument source = PDDocument.load(input_file);
PDDocument output = new PDDocument();
PDPage page = source.getPages().get(0);
output.addPage(page);
output.save(file);
output.close();
Here the problem is, the new document's page size is different than original document. So some text are cropped or missing in new document. I am using PDFBox 2.0 and how I can avoid this?
UPDATE:
Thanks #mkl.
Splitter did the magic. Here is the updated working part,
public static void extractAndCreateDocument(SplitMeta meta, PDDocument source)
throws IOException {
File file = new File(meta.getFilename());
Splitter splitter = new Splitter();
splitter.setStartPage(meta.getStart());
splitter.setEndPage(meta.getEnd());
splitter.setSplitAtPage(meta.getEnd());
List<PDDocument> docs = splitter.split(source);
if(docs.size() > 0){
PDDocument output = docs.get(0);
output.save(file);
output.close();
}
}
public class SplitMeta {
private String filename;
private int start;
private int end;
public SplitMeta() {
}
}

Unfortunately the OP has not provided a sample document to reproduce the issue. Thus, I have to guess.
I assume that the issue is based in objects not immediately linked to the page object but inherited from its parents.
In that case using PDDocument.addPage is the wrong choice as this method only adds the given page object to the target document page tree without consideration of inherited stuff.
Instead one should use PDDocument.importPage which is documented as:
/**
* This will import and copy the contents from another location. Currently the content stream is stored in a scratch
* file. The scratch file is associated with the document. If you are adding a page to this document from another
* document and want to copy the contents to this document's scratch file then use this method otherwise just use
* the {#link #addPage} method.
*
* Unlike {#link #addPage}, this method does a deep copy. If your page has annotations, and if
* these link to pages not in the target document, then the target document might become huge.
* What you need to do is to delete page references of such annotations. See
* here for how to do this.
*
* #param page The page to import.
* #return The page that was imported.
*
* #throws IOException If there is an error copying the page.
*/
public PDPage importPage(PDPage page) throws IOException
Actually even this method might not suffice as is as it does not consider all inherited attributes, but looking at the Splitter utility class one gets an impression what one has to do:
PDPage imported = getDestinationDocument().importPage(page);
imported.setCropBox(page.getCropBox());
imported.setMediaBox(page.getMediaBox());
// only the resources of the page will be copied
imported.setResources(page.getResources());
imported.setRotation(page.getRotation());
// remove page links to avoid copying not needed resources
processAnnotations(imported);
making use of the helper method
private void processAnnotations(PDPage imported) throws IOException
{
List<PDAnnotation> annotations = imported.getAnnotations();
for (PDAnnotation annotation : annotations)
{
if (annotation instanceof PDAnnotationLink)
{
PDAnnotationLink link = (PDAnnotationLink)annotation;
PDDestination destination = link.getDestination();
if (destination == null && link.getAction() != null)
{
PDAction action = link.getAction();
if (action instanceof PDActionGoTo)
{
destination = ((PDActionGoTo)action).getDestination();
}
}
if (destination instanceof PDPageDestination)
{
// TODO preserve links to pages within the splitted result
((PDPageDestination) destination).setPage(null);
}
}
// TODO preserve links to pages within the splitted result
annotation.setPage(null);
}
}
As you are trying to split the single PDF into multiple, like 10 page document into 10 single page document, you might want to use this Splitter utility class as is.
Tests
To test those methods I used the output of the PDF Clown sample output AnnotationSample.Standard.pdf because that library heavily depends on inheritance of page tree values. Thus, I copied the content of its only page to a new document using either PDDocument.addPage, PDDocument.importPage, or Splitter like this:
PDDocument source = PDDocument.load(resource);
PDDocument output = new PDDocument();
PDPage page = source.getPages().get(0);
output.addPage(page);
output.save(new File(RESULT_FOLDER, "PageAddedFromAnnotationSample.Standard.pdf"));
output.close();
(CopyPages.java test testWithAddPage)
PDDocument source = PDDocument.load(resource);
PDDocument output = new PDDocument();
PDPage page = source.getPages().get(0);
output.importPage(page);
output.save(new File(RESULT_FOLDER, "PageImportedFromAnnotationSample.Standard.pdf"));
output.close();
(CopyPages.java test testWithImportPage)
PDDocument source = PDDocument.load(resource);
Splitter splitter = new Splitter();
List<PDDocument> results = splitter.split(source);
Assert.assertEquals("Expected exactly one result document from splitting a single page document.", 1, results.size());
PDDocument output = results.get(0);
output.save(new File(RESULT_FOLDER, "PageSplitFromAnnotationSample.Standard.pdf"));
output.close();
(CopyPages.java test testWithSplitter)
Only the final test copied the page faithfully.

iText Android - Adding text to existing PDF

we have a PDF with some fields in order to collect some data, and I have to fill it programmatically with iText on Android by adding some text in those positions. I've been thinking about different ways to achieve this, with little success in each one.
Note: I'm using the Android version of iText (iTextG 5.5.4) and a Samsung Galaxy Note 10.1 2014 (Android 4.4) for most of my tests.
The approach I took from the start was to "draw" the text on a given coordinates, for a given page. This has some problems with the management of the fields (I have to be aware of the length of the strings, and it could be hard to position each text in the exact coordinate of the pdf). But most importantly, the performance of the process is really slow in some devices/OSVersions (it works great in Nexus 5 with 5.0.2, but takes several minutes with a 5MB Pdf on the Note 10.1).
pdfReader = new PdfReader(is);
document = new Document();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
pdfCopy = new PdfCopy(document, baos);
document.open();
PdfImportedPage page;
PdfCopy.PageStamp stamp;
for (int i = 1; i <= pdfReader.getNumberOfPages(); i++) {
page = pdfCopy.getImportedPage(pdfReader, i); // First page = 1
stamp = pdfCopy.createPageStamp(page);
for (int i=0; i<10; i++) {
int posX = i*50;
int posY = i*100;
Phrase phrase = new Phrase("Example text", FontFactory.getFont(FontFactory.HELVETICA, 12, BaseColor.RED));
ColumnText.showTextAligned(stamp.getOverContent(), Element.ALIGN_CENTER, phrase, posX, posY, 0);
}
stamp.alterContents();
pdfCopy.addPage(page);
}
We though about adding "forms fields" instead of drawing. That way I can configure a TextField and avoid managing the texts myself. However, the final PDF shouldn't have any annotations, so I would need to copy it into a new Pdf without annotations and with those "forms fields" drawn. I don't have an example of this because I wasn't able to perform this, I don't even know if this is possible/worthwhile.
The third option would be to receive a Pdf with the "forms fields" already added, that way I only have to fill them. However I still need to create a new Pdf with all those fields and without annotations...
I'd like to know what's be the best way in performance to do this process, and any help about achieving it. I am really newbie with iText and any help would be really appreciated.
Thanks!
EDIT
At the end I used the third option: a PDF with editable fields that we fill, and then we use the "flattening" to create a non-editable PDF with all texts already there.
The code is as follows:
pdfReader = new PdfReader(is);
FileOutputStream fios = new FileOutputStream(outPdf);
PdfStamper pdfStamper = new PdfStamper(pdfReader, fios);
//Filling the PDF (It's totally necessary that the PDF has Form fields)
fillPDF(pdfStamper);
//Setting the PDF to uneditable format
pdfStamper.setFormFlattening(true);
pdfStamper.close();
and the method to fill the forms:
public static void fillPDF(PdfStamper stamper) throws IOException, DocumentException{
//Getting the Form fields from the PDF
AcroFields form = stamper.getAcroFields();
Set<String> fields = form.getFields().keySet();
for(String field : fields){
form.setField("name", "Ernesto");
form.setField("surname", "Lage");
}
}
}
The only thing about this approach is that you need to know the name of each field in order to fill it.

There is a process in iText known as 'flattening', which takes the form fields, and replaces them with the text that the fields contain.
I haven't used iText in a few years (and not at all on Android), but if you search the manual or online examples for 'flattening', you should find how to do it.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

PDF issue with fonts in adobe, after using overlays in PDFBox - java

Related

Replace itext to pdfbox performance

Problem drawing some 16bit transparent images with PDPageContentStream.drawImage after updating PDFBox from Version 2.08 to 2.12/2.16

PDFBox Form fill - saveIncremental does not work

Page is cropped in new document in PDFBox while copying

iText Android - Adding text to existing PDF

Categories

Resources