IText merge documents with acrofields - java

I currently have a PdfReader and a PdfStamper that I am filling out the acrofields with. I now have to copy another pdf to the end of that form I have been filling out and when I do I lose the acrofield on the new form I copy over. Here is the code.
public static void addSectionThirteenPdf(PdfStamper stamper, Rectangle pageSize, int pageIndex){
PdfReader reader = new PdfReader(FacesContext.getCurrentInstance().getExternalContext().getResourceAsStream("/resources/documents/Section13.pdf"));
AcroFields fields = reader.getAcroFields();
fields.renameField("SecurityGuidancePage3", "SecurityGuidancePage" + pageIndex);
stamper.insertPage(pageIndex, pageSize);
stamper.replacePage(reader, 1, pageIndex);
}
The way that I am creating the original document is like this.
OutputStream output = FacesContext.getCurrentInstance().getExternalContext().getResponseOutputStream();
PdfReader pdfTemplate = new PdfReader(FacesContext.getCurrentInstance().getExternalContext().getResourceAsStream("/resources/documents/dd254.pdf"));
PdfStamper stamper = new PdfStamper(pdfTemplate, output);
stamper.setFormFlattening(true);
AcroFields fields = stamper.getAcroFields();
Is there a way to merge using the first piece of code and merge both of the acrofields together?

Depending on what you want exactly, different scenarios are possible, but in any case: you are doing it wrong. You should use either PdfCopy or PdfSmartCopy to merge documents.
The different scenarios are explained in the following video tutorial.
You can find most of the examples in the iText sandbox.
Merging different forms (having different fields)
If you want to merge different forms without flattening them, you should use PdfCopy as is done in the MergeForms example:
public void createPdf(String filename, PdfReader[] readers) throws IOException, DocumentException {
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
copy.setMergeFields();
document.open();
for (PdfReader reader : readers) {
copy.addDocument(reader);
}
document.close();
for (PdfReader reader : readers) {
reader.close();
}
}
In this case, readers is an array of PdfReader instances containing different forms (with different field names), hence we use PdfCopy and we make sure that we don't forget to use the setMergeFields() method, or the fields won't be copied.
Merging identical forms (having identical fields)
In this case, we need to rename the fields, because we probably want different values on different pages. In PDF a field can only have a single value. If you merge identical forms, you have multiple visualizations of the same field, but each visualization will show the same value (because in reality, there is only one field).
Let's take a look at the MergeForms2 example:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
copy.setMergeFields();
document.open();
List<PdfReader> readers = new ArrayList<PdfReader>();
for (int i = 0; i < 3; ) {
PdfReader reader = new PdfReader(renameFields(src, ++i));
readers.add(reader);
copy.addDocument(reader);
}
document.close();
for (PdfReader reader : readers) {
reader.close();
}
}
public byte[] renameFields(String src, int i) throws IOException, DocumentException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, baos);
AcroFields form = stamper.getAcroFields();
Set<String> keys = new HashSet<String>(form.getFields().keySet());
for (String key : keys) {
form.renameField(key, String.format("%s_%d", key, i));
}
stamper.close();
reader.close();
return baos.toByteArray();
}
As you can see, the renameFields() method creates a new document in memory. That document is merged with other documents using PdfSmartCopy. If you'd use PdfCopy here, your document would be bloated (as we'll soon find out).
Merging flattened forms
In the FillFlattenMerge1, we fill out the forms using PdfStamper. The result is a PDF file that is kept in memory and that is merged using PdfCopy. While this example is fine if you'd merge different forms, this is actually an example on how not to do it (as explained in the video tutorial).
The FillFlattenMerge2 shows how to merge identical forms that are filled out and flattened correctly:
public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
document.open();
ByteArrayOutputStream baos;
PdfReader reader;
PdfStamper stamper;
AcroFields fields;
StringTokenizer tokenizer;
BufferedReader br = new BufferedReader(new FileReader(DATA));
String line = br.readLine();
while ((line = br.readLine()) != null) {
// create a PDF in memory
baos = new ByteArrayOutputStream();
reader = new PdfReader(SRC);
stamper = new PdfStamper(reader, baos);
fields = stamper.getAcroFields();
tokenizer = new StringTokenizer(line, ";");
fields.setField("name", tokenizer.nextToken());
fields.setField("abbr", tokenizer.nextToken());
fields.setField("capital", tokenizer.nextToken());
fields.setField("city", tokenizer.nextToken());
fields.setField("population", tokenizer.nextToken());
fields.setField("surface", tokenizer.nextToken());
fields.setField("timezone1", tokenizer.nextToken());
fields.setField("timezone2", tokenizer.nextToken());
fields.setField("dst", tokenizer.nextToken());
stamper.setFormFlattening(true);
stamper.close();
reader.close();
// add the PDF to PdfCopy
reader = new PdfReader(baos.toByteArray());
copy.addDocument(reader);
reader.close();
}
br.close();
document.close();
}
These are three scenarios. Your question is too unclear for anyone but you to decide which scenario is the best fit for your needs. I suggest that you take the time to learn before you code. Watch the video, try the examples, and if you still have doubts, you'll be able to post a smarter question.

Related

Delete paragraph from PDF - Java

Greeting,
What is the easiest way to delete text/paragraph from a PDF document. The program takes a PDF document and creates a separate PDF document for each page. In each document I have text from the original that I would like to delete.
I tried a couple of examples but it doesn't work or they use old libraries
I am using the iText 7 library
private void processPDF(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfDictionary dict = reader.getPageN(1);
PdfObject object = dict.getDirectObject(PdfName.CONTENTS);
if (object instanceof PRStream) {
PRStream stream = (PRStream) object;
byte[] data = PdfReader.getStreamBytes(stream);
String dd = new String(data, "UTF8")
.replace("Hand made software", "");
stream.setData(dd.getBytes("UTF8"));
if (dd.contains("Hand made software")) {
System.out.println("Contains");
}
}
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();
}
private void processPDF2(String src, String dest) throws InvalidPasswordException, IOException {
Map<String, String> map = new HashMap<>();
map.put("Hand made software", "");
File template = new File(src);
PDDocument document = PDDocument.load(template);
List<PDField> fields = document.getDocumentCatalog().getAcroForm().getFields();
for (PDField field : fields) {
for (Map.Entry<String, String> entry : map.entrySet()) {
if (entry.getKey().equals(field.getFullyQualifiedName())) {
field.setValue(entry.getValue());
field.setReadOnly(true);
}
}
}
File out = new File(dest);
document.save(out);
document.close();
}
I want to delete line "Hand made software"
You can make it easy iterating through the PDF elements. First create a PDF reader and writter that will read the template located on src string path.
File template = new File(src);
PdfReader reader = new PdfReader(template);
File out = new File(dest);
PdfWritter writter = new PdfWritter(out);
Then create a Document object by firstly creating a PdfDocument:
PdfDocument pdf = new PdfDocument(reader, writter);
Document document = new Document(pdf);
Last iterate through the elements of the pdf document until "Hand made software" line is found:
for (int i = 0; i < document.getRoots().size(); i++) {
if (document.getRoots().get(i) instanceof Paragraph) { //iterate only through paragraphs
Paragraph paragraph = (Paragraph) document.getRoots().get(i);
if(paragraph.getText().equals("Hand made software")){ //if the paragraph equals to the string to be removed, remove from the document
document.getRoots().remove(i);
i--;
}
}
}
Finally close de document to save the changes
document.close();

merge multiple pdfs in order

hey guys sorry for long post and bad language and if there is unnecessary details
i created multiple 1page pdfs from one pdf template using excel document
i have now
something like this
tempfile0.pdf
tempfile1.pdf
tempfile2.pdf
...
im trying to merge all files in one single pdf using itext5
but it semmes that the pages in the resulted pdf are not in the order i wanted
per exemple
tempfile0.pdf in the first page
tempfile1. int the 2000 page
here is the code im using.
the procedure im using is:
1 filling a from from a hashmap
2 saving the filled form as one pdf
3 merging all the files in one single pdf
public void fillPdfitext(int debut,int fin) throws IOException, DocumentException {
for (int i =debut; i < fin; i++) {
HashMap<String, String> currentData = dataextracted[i];
// textArea.appendText("\n"+pdfoutputname +" en cours de preparation\n ");
PdfReader reader = new PdfReader(this.sourcePdfTemplateFile.toURI().getPath());
String outputfolder = this.destinationOutputFolder.toURI().getPath();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(outputfolder+"\\"+"tempcontrat"+debut+"-" +i+ "_.pdf"));
// get the document catalog
AcroFields acroForm = stamper.getAcroFields();
// as there might not be an AcroForm entry a null check is necessary
if (acroForm != null) {
for (String key : currentData.keySet()) {
try {
String fieldvalue=currentData.get(key);
if (key=="ADRESSE1"){
fieldvalue = currentData.get("ADRESSE1")+" "+currentData.get("ADRESSE2") ;
acroForm.setField("ADRESSE", fieldvalue);
}
if (key == "IMEI"){
acroForm.setField("NUM_SERIE_PACK", fieldvalue);
}
acroForm.setField(key, fieldvalue);
// textArea.appendText(key + ": "+fieldvalue+"\t\t");
} catch (Exception e) {
// e.printStackTrace();
}
}
stamper.setFormFlattening(true);
}
stamper.close();
}
}
this is the code for merging
public void Merge() throws IOException, DocumentException
{
File[] documentPaths = Main.objetapp.destinationOutputFolder.listFiles((dir, name) -> name.matches( "tempcontrat.*\\.pdf" ));
Arrays.sort(documentPaths, NameFileComparator.NAME_INSENSITIVE_COMPARATOR);
byte[] mergedDocument;
try (ByteArrayOutputStream memoryStream = new ByteArrayOutputStream())
{
Document document = new Document();
PdfSmartCopy pdfSmartCopy = new PdfSmartCopy(document, memoryStream);
document.open();
for (File docPath : documentPaths)
{
PdfReader reader = new PdfReader(docPath.toURI().getPath());
try
{
reader.consolidateNamedDestinations();
PdfImportedPage pdfImportedPage = pdfSmartCopy.getImportedPage(reader, 1);
pdfSmartCopy.addPage(pdfImportedPage);
}
finally
{
pdfSmartCopy.freeReader(reader);
reader.close();
}
}
document.close();
mergedDocument = memoryStream.toByteArray();
}
FileOutputStream stream = new FileOutputStream(this.destinationOutputFolder.toURI().getPath()+"\\"+
this.sourceDataFile.getName().replaceFirst("[.][^.]+$", "")+".pdf");
try {
stream.write(mergedDocument);
} finally {
stream.close();
}
documentPaths=null;
Runtime r = Runtime.getRuntime();
r.gc();
}
my question is how to keep the order of the files the same in the resulting pdf
It is because of naming of files. Your code
new FileOutputStream(outputfolder + "\\" + "tempcontrat" + debut + "-" + i + "_.pdf")
will produce:
tempcontrat0-0_.pdf
tempcontrat0-1_.pdf
...
tempcontrat0-10_.pdf
tempcontrat0-11_.pdf
...
tempcontrat0-1000_.pdf
Where tempcontrat0-1000_.pdf will be placed before tempcontrat0-11_.pdf, because you are sorting it alphabetically before merge.
It will be better to left pad file number with 0 character using leftPad() method of org.apache.commons.lang.StringUtils or java.text.DecimalFormat and have it like this tempcontrat0-000000.pdf, tempcontrat0-000001.pdf, ... tempcontrat0-9999999.pdf.
And you can also do it much simpler and skip writing into file and then reading from file steps and merge documents right after the form fill and it will be faster. But it depends how many and how big documents you are merging and how much memory do you have.
So you can save the filled document into ByteArrayOutputStream and after stamper.close() create new PdfReader for bytes from that stream and call pdfSmartCopy.getImportedPage() for that reader. In short cut it can look like:
// initialize
PdfSmartCopy pdfSmartCopy = new PdfSmartCopy(document, memoryStream);
for (int i = debut; i < fin; i++) {
ByteArrayOutputStream out = new ByteArrayOutputStream();
// fill in the form here
stamper.close();
PdfReader reader = new PdfReader(out.toByteArray());
reader.consolidateNamedDestinations();
PdfImportedPage pdfImportedPage = pdfSmartCopy.getImportedPage(reader, 1);
pdfSmartCopy.addPage(pdfImportedPage);
// other actions ...
}

Removing all embedded files in iTextSharp

As I've become interested in iTextSharp I need to learn C#. Since I know a bit of AutoHotkey (simple yet powerful script programming language for Windows) it is easer for me. However, I often come across code written in Java which is said to be easily converted to C#. Unfortunetly, I have some problems with it. Let's have a look at original code written by Bruno Lowagie.
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.getCatalog();
PdfDictionary names = root.getAsDict(PdfName.NAMES);
names.remove(PdfName.EMBEDDEDFILES);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
}
This is what I have managed to write on my own:
static void removeFiles(string sourceFilePath, string destFilePath)
{
try
{
// read src file
FileStream inputStream = new FileStream(sourceFilePath, FileMode.Open, FileAccess.Read, FileShare.None);
Document source = new Document();
// open for reading
PdfWriter reader = PdfReader.GetInstance(inputStream);
source.Open();
// create dest file
FileStream outputStream = new FileStream(destFilePath, FileMode.Create, FileAccess.ReadWrite, FileShare.None);
Document dest = new Document();
PdfWriter writer = PdfWriter.GetInstance(dest, inputStream); // open stream from src
// remove embedded files from dest
PdfDictionary root = dest.getCatalog().getPdfObject();
PdfDictionary names = root.getAsDictionary(PdfName.Names);
names.remove(PdfName.EmbeddedFiles);
// close all
source.Close();
dest.Close();
}
catch (Exception ex)
{
}
}
Unfortunately, there are many errors such as:
'Document' does not contain a definition for 'getCatalog' and no extension method 'getCatalog'
'PdfReader' does not contain a definition for 'GetInstance'
This is what I have managed to do after countless hours of coding and googling.
There are some iTextSharp examples available. See for instance: How to read a PDF Portfolio using iTextSharp
I don't know much about C#, but this is a first attempt to fix your code:
static void RemoveFiles(string sourceFilePath, string destFilePath)
{
// read src file
FileStream inputStream = new FileStream(sourceFilePath, FileMode.Open, FileAccess.Read, FileShare.None);
// open for reading
PdfReader reader = new PdfReader(inputStream);
FileStream outputStream = new FileStream(destFilePath, FileMode.Create, FileAccess.ReadWrite, FileShare.None);
PdfStamper stamper = new PdfStamper(reader, outputStream);
// remove embedded files
PdfDictionary root = reader.Catalog;
PdfDictionary names = root.GetAsDict(PdfName.NAMES);
names.Remove(PdfName.EMBEDDEDFILES);
// close all
stamper.Close();
reader.Close();
}
Note that I don't understand why you were using Document and PdfWriter. You should use PdfStamper instead. Also: you are only removing the document-level attachments. If there are file attachment annotations, they will still be present in the PDF.

iTextpdf 5.5.8 With JavaEE method PdfCopy addDocument undefined

I am trying to create a pdf with pdfcopy, which is composed with 2 sides of a card.
I am running this with Jetty. but i keep getting this error :
[ERROR] The method addDocument(PdfReader) is undefined for the type PdfCopy
My code :
import com.itextpdf.text.pdf.PdfCopy;
public static File exportCards(Person p) throws IOException, DocumentException {
Document document = new Document();
File temp = File.createTempFile("temp_file_name_", ".tmp");
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(temp));
document.open();
PdfReader reader;
PdfReader reader_2;
reader = new PdfReader("FrontStatic.pdf");
reader_2 = new PdfReader("BackStatic.pdf");
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(temp+"_f"));
PdfStamper stamper_2 = new PdfStamper(reader_2, new FileOutputStream(temp+"_r"));
AcroFields form = stamper.getAcroFields();
AcroFields form_2 = stamper_2.getAcroFields();
form.removeXfa();
//Recto
form.setField("form1[0].#subform[0].firstname[0]", p.getFirstName());
form.setField("form1[0].#subform[0].lastname[0]", p.getLastName());
//Verso
form_2.setField("form1[0].#subform[0].number[0]", String.valueOf(p.getIdentityID()));
copy.addDocument(reader);
copy.addDocument(reader_2);
stamper.close();
stamper_2.close();
reader.close();
reader_2.close();
//copy.close();
//document.close();
return temp;
i run the same code on a new program to test only this part, without anything else, and it's running, at least i get the 2 temp file "Stamp" with my data. I do not get the merged document.
Even if i still have a problem with my method to get the final document the error occures only with de Server Web program.
Does someone have an idea ?
Thank you for your time.

function that can use iText to concatenate / merge pdfs together - causing some issues

I'm using the following code to merge PDFs together using iText:
public static void concatenatePdfs(List<File> listOfPdfFiles, File outputFile) throws DocumentException, IOException {
Document document = new Document();
FileOutputStream outputStream = new FileOutputStream(outputFile);
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
for (File inFile : listOfPdfFiles) {
PdfReader reader = new PdfReader(inFile.getAbsolutePath());
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
document.newPage();
PdfImportedPage page = writer.getImportedPage(reader, i);
cb.addTemplate(page, 0, 0);
}
}
outputStream.flush();
document.close();
outputStream.close();
}
This usually works great! But once and a while, it's rotating some of the pages by 90 degrees? Anyone ever have this happen?
I am looking into the PDFs themselves to see what is special about the ones that are being flipped.
There are errors once in a while because you are using the wrong method to concatenate documents. Please read chapter 6 of my book and you'll notice that using PdfWriter to concatenate (or merge) PDF documents is wrong:
You completely ignore the page size of the pages in the original document (you assume they are all of size A4),
You ignore page boundaries such as the crop box (if present),
You ignore the rotation value stored in the page dictionary,
You throw away all interactivity that is present in the original document, and so on.
Concatenating PDFs is done using PdfCopy, see for instance the FillFlattenMerge2 example:
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
document.open();
PdfReader reader;
String line = br.readLine();
// loop over readers
// add the PDF to PdfCopy
reader = new PdfReader(baos.toByteArray());
copy.addDocument(reader);
reader.close();
// end loop
document.close();
There are other examples in the book.
In case anyone is looking for it, using Bruno Lowagie's correct answer above, here is the version of the function that does not seem to have the page flipping issue i described above:
public static void concatenatePdfs(List<File> listOfPdfFiles, File outputFile) throws DocumentException, IOException {
Document document = new Document();
FileOutputStream outputStream = new FileOutputStream(outputFile);
PdfCopy copy = new PdfSmartCopy(document, outputStream);
document.open();
for (File inFile : listOfPdfFiles) {
PdfReader reader = new PdfReader(inFile.getAbsolutePath());
copy.addDocument(reader);
reader.close();
}
document.close();
}

Categories

Resources