Modifying an existing document using pdfstamper - java

I have a PDF that is half static and half dynamic, and the dynamic part can grow across multiple pages. I created the static part in Adobe LiveCycle and am using iText to create the dynamic part. The dynamic part of the form is a table that has to expand across multiple pages based on the input. The form has AcroFields in both parts.
I have used ColumnText and PdfStamper to add the content to the existing PDF, and the table grows dynamically, which works fine. The problems are:
Every table cell needs an AcroField. I used a PdfPCell event to create it, but after some googling I could only find code that uses PdfWriter, not PdfStamper.
On the first page, how do I restrict the table content so that it doesn't run to the end of the page and I can insert page numbers at the bottom?
I need to add a signature field at the end of the table. How do I know the coordinates of the end of the dynamic table?
My code snippet for dynamic table part:
ColumnText column = new ColumnText(stamper.getOverContent(1));
Rectangle rectPage1 = new Rectangle(792, 270);
column.setSimpleColumn(rectPage1);
column.addElement(table);
int pagecount = 1;
Rectangle rectPage2 = new Rectangle(792, 540);
int status = column.go();
while (ColumnText.hasMoreText(status) ) {
status = triggerNewPage(stamper, pagesize, column, rectPage2, ++pagecount);
}
public int triggerNewPage(PdfStamper stamper, Rectangle pagesize, ColumnText column, Rectangle rect, int pagecount) throws DocumentException {
stamper.insertPage(pagecount, pagesize);
PdfContentByte canvas = stamper.getOverContent(pagecount);
column.setCanvas(canvas);
column.setSimpleColumn(rect);
return column.go();
}

Related

IText 7 How To Add Div or Paragraph in Header Without Overlapping Page Content?

I am facing the following problem, for which I haven't found a solution yet. I am implementing a platform for a medical laboratory. For every incident, they want to write the report into the system and then generate and print it from the system. I am using iText 7 to accomplish this.
They have a very unusual template. At the top of the first page they want to print a specific table, while at the top of every other page they want to print something else. So I need to know when the page changes in order to print the corresponding table at the top of the page.
After reading various sources, I ended up creating the first page normally and then adding a header event handler that checks the page number and runs on every page except page 1.
public class VariableHeaderEventHandler implements IEventHandler {
@Override
public void handleEvent(Event event) {
System.out.println("THIS IS ME: HEADER EVENT HANDLER STARTED.....");
PdfDocumentEvent documentEvent = (PdfDocumentEvent) event;
PdfDocument pdfDoc = documentEvent.getDocument();
PdfPage page = documentEvent.getPage();
Rectangle pageSize = page.getPageSize();
int pageNumber = pdfDoc.getPageNumber(page);
if (pageNumber == 1) return; //Do nothing in the first page...
System.out.println("Page size: " + pageSize.getHeight());
Rectangle rectangle = new Rectangle(pageSize.getLeft() + 30, pageSize.getHeight()-234, pageSize.getWidth() - 60, 200);
PdfCanvas pdfCanvas = new PdfCanvas(page.newContentStreamBefore(), page.getResources(), pdfDoc);
pdfCanvas.rectangle(rectangle);
pdfCanvas.setFontAndSize(FontsAndStyles.getRegularFont(), 10);
Canvas canvas = new Canvas(pdfCanvas, pdfDoc, rectangle);
Div header = new Div();
Paragraph paragraph = new Paragraph();
Text text = new Text("Διαγνωστικό Εργαστήριο Ιστοπαθολογίας και Μοριακής Παθολογοανατομικής").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Μοριακή Διάγνωση σε Συνεργασία με").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Γκούρβας Βίκτωρας, M.D., Ph.D.").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Τσιμισκή 33, Τ.Κ. 54624, ΘΕΣΣΑΛΟΝΙΚΗ").addStyle(FontsAndStyles.getNormalStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Τήλ/Φάξ: 2311292924 Κιν.: 6932104909 e-mail: vgourvas@gmail.com").addStyle(FontsAndStyles.getNormalStyle());
paragraph.add(text);
header.add(paragraph);
// =============Horizontal Line BOLD============
SolidLine solidLine = new SolidLine((float) 1.5);
header.add(new LineSeparator(solidLine));
// ========Horizontal Line BOLD End==========
text = new Text("ΠΑΘΟΛΟΓΟΑΝΑΤΟΜΙΚΗ ΕΞΕΤΑΣΗ").addStyle(FontsAndStyles.getBoldStyle());
paragraph = new Paragraph().add(text);
header.add(paragraph);
header.setTextAlignment(TextAlignment.CENTER);
canvas.add(header);
canvas.close();
}
}
However, the problem I am facing now is that the header overlaps the content, and I can't figure out how to set different margins per page. For example, from page 2 onward I would like a different top margin.
Has anyone faced this problem before and found a working solution? Is my implementation correct? Is there a better way of accomplishing the same result?
Thanks in advance,
Toutoudakis Michail
You should create your own custom document renderer and decrease the area which would be used to place content for each page except for the first one.
Please look at the snippet below and updateCurrentArea method in particular.
class CustomDocumentRenderer extends DocumentRenderer {
public CustomDocumentRenderer(Document document) {
super(document);
}
@Override
public IRenderer getNextRenderer() {
return new CustomDocumentRenderer(this.document);
}
@Override
protected LayoutArea updateCurrentArea(LayoutResult overflowResult) {
LayoutArea area = super.updateCurrentArea(overflowResult);
if (currentPageNumber > 1) {
area.setBBox(area.getBBox().decreaseHeight(200));
}
return area;
}
}
Then just set the renderer on your document:
Document doc = new Document(pdfDoc);
doc.setRenderer(new CustomDocumentRenderer(doc));
The resultant pdf which I get for your document looks as follows:
There is another solution however. Once you've added at least one element to your document, you can change the default document's margins. The change will be applied on all pages created afterwards (and in your case these are pages 2, 3, ...)
doc.add(new Paragraph("At least one element should be added. Otherwise the first page wouldn't be created and changing of the default margins would affect it."));
doc.setMargins(200, 36, 36, 36);
// now you can be sure that all the next pages would have new margins
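As a rough check on the numbers involved, the arithmetic behind both suggestions can be sketched in plain Java. This is not iText code: the class and method names are hypothetical, the A4 page height of 842 points is an assumption, and so is the 36-point default top margin. The header rectangle is the one from the question, whose lower edge sits at pageHeight - 234.

```java
// Plain-Java arithmetic sketch; HeaderClearance and its methods are hypothetical helpers.
final class HeaderClearance {

    /** Distance from the top of the page down to the bottom edge of a box. */
    static float topMarginToClear(float pageHeight, float boxBottomY) {
        return pageHeight - boxBottomY;
    }

    /** Top edge of the content area after shrinking it further from the top. */
    static float contentTop(float pageHeight, float topMargin, float extraDecrease) {
        return pageHeight - topMargin - extraDecrease;
    }

    public static void main(String[] args) {
        float pageHeight = 842f;                 // assumed A4 height in points
        float headerBottom = pageHeight - 234f;  // lower edge of the header rectangle in the question

        // Margin needed to stay below the whole header box.
        System.out.println("margin to clear box: " + topMarginToClear(pageHeight, headerBottom));

        // Content top with an assumed 36pt margin plus the 200pt decrease from the renderer.
        System.out.println("content top (renderer): " + contentTop(pageHeight, 36f, 200f));

        // Content top with a plain 200pt top margin and no extra decrease.
        System.out.println("content top (margin 200): " + contentTop(pageHeight, 200f, 0f));
    }
}
```

Under these assumptions the header box ends 234 points below the top of the page, so whether a top margin of 200 is enough depends on how much of that box the header text actually fills. If it fills the box completely, a slightly larger margin may be needed.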

iText pdf Multiple Pages with same Content

How can I generate a PDF report of multiple pages with the same content on each page? The following is the code for a single-page report. The multiple pages should be in a single PDF file.
<%
response.setContentType( "application/pdf" );
response.setHeader ("Content-Disposition","attachment;filename=TEST1.pdf");
Document document=new Document(PageSize.A4,25,25,35,0);
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
PdfWriter writer=PdfWriter.getInstance( document, buffer);
document.open();
Font fontnormalbold = FontFactory.getFont("Arial", 10, Font.BOLD);
Paragraph p1=new Paragraph("",fontnormalbold);
float[] iwidth = {1f,1f,1f,1f,1f,1f,1f,1f};
float[] iwidth1 = {1f};
PdfPTable table1 = new PdfPTable(iwidth);
table1.setWidthPercentage(100);
PdfPCell cell = new PdfPCell(new Paragraph("Testing Page",fontnormalbold));
cell.setHorizontalAlignment(1);
cell.setColspan(8);
cell.setPadding(5.0f);
table1.addCell(cell);
PdfPTable outerTable = new PdfPTable(iwidth1);
outerTable.setWidthPercentage(100);
PdfPCell containerCell = new PdfPCell();
containerCell.addElement(table1);
outerTable.addCell(containerCell);
p1.add(outerTable);
document.add(new Paragraph(p1));
document.close();
DataOutput output = new DataOutputStream( response.getOutputStream() );
byte[] bytes = buffer.toByteArray();
response.setContentLength(bytes.length);
for( int i = 0; i < bytes.length; i++ ) { output.writeByte( bytes[i] ); }
response.getOutputStream().flush();
response.getOutputStream().close();
%>
There are different ways to solve this problem. Not all of the solutions are elegant.
Approach 1: add the same table many times.
I see that you are creating a PdfPTable object named outerTable. I'm going to ignore the silly things you do with this table (e.g. why are you adding this table to a Paragraph? Why are you adding a single cell with colspan 8 to a table with 8 columns? Why are you nesting this table into a table with a single column? All of these shenanigans are really weird), but having that outertable, you could do this:
for (int i = 0; i < x; i++) {
document.add(outerTable);
document.newPage();
}
This will add the table x times and it will start a new page for every table. This is also what the people in the comments advised you, and although the code looks really elegant, it doesn't result in an elegant PDF. That is: if you were my employee, I'd fire you if you did this.
Why? Because adding a table requires CPU and you are using x times the CPU you need. Moreover, with every table you create, you create new content streams. The same content will be added x times to your document. Your PDF will be about x times bigger than it should be.
Why would this be a reason to fire a developer? Because applications like this usually live in the cloud. In the cloud, one usually pays for CPU and bandwidth. A developer who writes code that requires a multiple of the CPU and bandwidth causes a cost that is unacceptable. In many cases, it is more cost-efficient to fire bad developers, hire slightly more expensive developers and buy slightly more expensive software, and then save plenty of money in the long term thanks to code that is more efficient in terms of CPU and bandwidth.
Approach 2: add the table to a PdfTemplate, reuse the PdfTemplate.
Please take a look at my answer to the StackOverflow question How to resize a PdfPTable to fit the page?
In this example, I create a PdfPTable named table. I know how wide I want the table to be (PageSize.A4.getWidth()), but I don't know in advance how high it will be. So I lock the width, I add the cells I need to add, and then I can calculate the height of the table like this: table.getTotalHeight().
I create a PdfTemplate that is exactly as big as the table:
PdfContentByte canvas = writer.getDirectContent();
PdfTemplate template = canvas.createTemplate(
table.getTotalWidth(), table.getTotalHeight());
I now add the table to this template:
table.writeSelectedRows(0, -1, 0, table.getTotalHeight(), template);
I wrap the table inside an Image object. This doesn't mean we're rasterizing the table, all text and lines are preserved as vector-data.
Image img = Image.getInstance(template);
I scale the img so that it fits the page size I have in mind:
img.scaleToFit(PageSize.A4.getWidth(), PageSize.A4.getHeight());
Now I position the table vertically in the middle.
img.setAbsolutePosition(
0, (PageSize.A4.getHeight() - table.getTotalHeight()) / 2);
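The scale-and-center arithmetic of the last two steps can be sketched in plain Java. This is an illustration rather than iText code: the helper names are hypothetical, the A4 size of 595 x 842 points is assumed, and the table dimensions are made up.

```java
// Plain-Java sketch of aspect-preserving fit and vertical centering; names are hypothetical.
final class FitAndCenter {

    /** Uniform scale factor that makes (width, height) fit inside (maxWidth, maxHeight). */
    static float fitScale(float width, float height, float maxWidth, float maxHeight) {
        return Math.min(maxWidth / width, maxHeight / height);
    }

    /** Lower-left y coordinate that vertically centers a box of the given height on the page. */
    static float centeredY(float pageHeight, float boxHeight) {
        return (pageHeight - boxHeight) / 2f;
    }

    public static void main(String[] args) {
        float pageW = 595f, pageH = 842f;     // assumed A4 size in points
        float tableW = 595f, tableH = 1684f;  // made-up table, twice as tall as the page

        float scale = fitScale(tableW, tableH, pageW, pageH);
        float scaledH = tableH * scale;

        System.out.println("scale = " + scale);
        System.out.println("y using scaled height = " + centeredY(pageH, scaledH));
        System.out.println("y using unscaled height = " + centeredY(pageH, tableH));
    }
}
```

One thing this suggests: when the table is taller than the page and gets scaled down, centering with the unscaled height gives a negative offset. In that case it may be safer to center with the scaled height, which in iText 5 should be available as img.getScaledHeight().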
If you want to add the table multiple times, this is how you'd do it:
for (int i = 0; i < x; i++) {
document.add(img);
document.newPage();
}
What is the difference with Approach 1? Well, by using PdfTemplate, you are creating a Form XObject. A Form XObject is a content stream that is external to the page stream. A Form XObject is stored in the PDF file only once, and it can be reused many times, e.g. on every page of a document.
Approach 3: create a PDF document with a single page; concatenate the file many times
You are creating your PDF in memory. The PDF is stored in the buffer object. You could read this PDF using PdfReader like this:
PdfReader reader = new PdfReader(buffer.toByteArray());
Then you reuse this content like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document doc = new Document();
PdfSmartCopy copy = new PdfSmartCopy(doc, baos);
doc.open();
for (int i = 0; i < x; i++) {
copy.addDocument(reader);
}
doc.close();
reader.close();
Now you can send the bytes stored in baos to the OutputStream of your response object. Make sure that you use PdfSmartCopy instead of PdfCopy. PdfCopy just copies the pages AS-IS without checking if there is redundant information. The result is a bloated PDF similar to the one you'd get if you'd use Approach 1. PdfSmartCopy looks at the bytes of the content streams and will detect that you're adding the same page over and over again. That page will be reused the same way as is done in Approach 2.
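To make the difference between copying as-is and detecting redundant content concrete, here is a toy model of content-based deduplication in plain Java. It is only a sketch of the general idea, not iText's actual implementation: the class name, the use of SHA-256, and the sample content stream are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of content-based deduplication; hypothetical class, not part of iText.
final class StreamDeduplicator {
    private final Map<String, Integer> indexByDigest = new HashMap<>();
    private final List<byte[]> storedStreams = new ArrayList<>();

    /** Stores the stream only if identical bytes were not seen before; returns its index. */
    int add(byte[] stream) {
        String key = digest(stream);
        Integer existing = indexByDigest.get(key);
        if (existing != null) {
            return existing;              // same bytes: reuse the stored copy
        }
        storedStreams.add(stream.clone());
        int index = storedStreams.size() - 1;
        indexByDigest.put(key, index);
        return index;
    }

    /** Number of distinct streams actually stored. */
    int storedCount() {
        return storedStreams.size();
    }

    private static String digest(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up page content, added five times like the same page in a loop.
        byte[] page = "BT /F1 12 Tf (Testing Page) Tj ET".getBytes(StandardCharsets.US_ASCII);
        StreamDeduplicator dedup = new StreamDeduplicator();
        for (int i = 0; i < 5; i++) {
            dedup.add(page);
        }
        System.out.println("pages added: 5, streams stored: " + dedup.storedCount());
    }
}
```

A copier that skipped the lookup would store five copies here, which is the bloat described for Approach 1.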

Adding table to existing PDF on the same page - ITEXT

I have two parts to my java project.
I need to populate the fields of a pdf
I need to add a table below the populated section on the blank area of the page (and this table needs to be able to rollover to the next page).
I am able to do these things separately (populate the pdf and create a table). But I cannot effectively merge them. I have tried doing a doc.add(table) which will result in the table being on the next page of the pdf, which I don't want.
I essentially just need to be able to specify where the table starts on the page (so it wouldn't overlap the existing content) and then stamp the table onto the existing pdf.
My other option if this doesn't work is trying to add fields to the original pdf that will be filled by the table contents (so it will instead be a field-based table).
Any suggestions?
EDIT:
I'm new to iText and have not used ColumnText before. I'm trying to test it out in the following code, but the table is not being displayed. I looked at other ColumnText examples and could not see where the ColumnText is added back into the PDF.
//CREATE FILLED FORM PDF
PdfReader reader = new PdfReader(sourcePath);
PdfStamper pdfStamper = new PdfStamper(reader, new FileOutputStream(destPath));
pdfStamper.setFormFlattening(true);
AcroFields form = pdfStamper.getAcroFields();
form.setField("ID", "99999");
form.setField("ADDR1", "425 Test Street");
form.setField("ADDR2", "Test, WA 91334");
form.setField("PHNBR", "(999)999-9999");
form.setField("NAME", "John Smith");
//CREATE TABLE
PdfPTable table = new PdfPTable(3);
Font bfBold12 = new Font(FontFamily.HELVETICA, 12, Font.BOLD, new BaseColor(0, 0, 0));
insertCell(table, "Table", Element.ALIGN_CENTER, 1, bfBold12);
table.completeRow();
ColumnText column = new ColumnText(pdfStamper.getOverContent(1));
column.addElement(table);
pdfStamper.close();
reader.close();
Please take a look at the AddExtraTable example. It's a simplification of the AddExtraPage example written in answer to the question How to continue field output on a second page?
That question is almost an exact duplicate of your question, the only difference being that your requirement is easier to achieve.
I simplified the code like this:
public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
PdfReader reader = new PdfReader(src);
Rectangle pagesize = reader.getPageSize(1);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
AcroFields form = stamper.getAcroFields();
form.setField("Name", "Jennifer");
form.setField("Company", "iText's next customer");
form.setField("Country", "No Man's Land");
PdfPTable table = new PdfPTable(2);
table.addCell("#");
table.addCell("description");
table.setHeaderRows(1);
table.setWidths(new int[]{ 1, 15 });
for (int i = 1; i <= 150; i++) {
table.addCell(String.valueOf(i));
table.addCell("test " + i);
}
ColumnText column = new ColumnText(stamper.getOverContent(1));
Rectangle rectPage1 = new Rectangle(36, 36, 559, 540);
column.setSimpleColumn(rectPage1);
column.addElement(table);
int pagecount = 1;
Rectangle rectPage2 = new Rectangle(36, 36, 559, 806);
int status = column.go();
while (ColumnText.hasMoreText(status)) {
status = triggerNewPage(stamper, pagesize, column, rectPage2, ++pagecount);
}
stamper.setFormFlattening(true);
stamper.close();
reader.close();
}
public int triggerNewPage(PdfStamper stamper, Rectangle pagesize, ColumnText column, Rectangle rect, int pagecount) throws DocumentException {
stamper.insertPage(pagecount, pagesize);
PdfContentByte canvas = stamper.getOverContent(pagecount);
column.setCanvas(canvas);
column.setSimpleColumn(rect);
return column.go();
}
As you can see, the main differences are:
We create a rectPage1 for the first page and a rectPage2 for page 2 and all pages that follow. That's because we don't need a full page on the first page.
We don't need to load a PdfImportedPage, instead we're just adding blank pages of the same size as the first page.
Possible improvements: I hardcoded the Rectangle instances. It goes without saying that rectPage1 depends on the location of your original form. I also hardcoded rectPage2. If I had more time, I would calculate rectPage2 based on the pagesize value.
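A minimal sketch of that calculation, in plain Java so it runs without iText: the helper name is hypothetical, and it assumes the page's lower-left corner is at (0, 0), an A4 page of 595 x 842 points, and 36-point margins.

```java
import java.util.Arrays;

// Plain-Java sketch; ColumnBounds is a hypothetical helper, not part of iText.
final class ColumnBounds {

    /** Returns {llx, lly, urx, ury} for a column inset from the page edges by the given margins. */
    static float[] inset(float pageWidth, float pageHeight,
                         float left, float bottom, float right, float top) {
        return new float[] { left, bottom, pageWidth - right, pageHeight - top };
    }

    public static void main(String[] args) {
        // Assumed A4 page with 36pt margins on every side.
        float[] full = inset(595f, 842f, 36f, 36f, 36f, 36f);
        System.out.println(Arrays.toString(full));

        // A larger bottom margin (here 72pt, an arbitrary choice) leaves room for a page number.
        float[] withFooter = inset(595f, 842f, 36f, 72f, 36f, 36f);
        System.out.println(Arrays.toString(withFooter));
    }
}
```

With these assumptions the first call reproduces the hardcoded values 36, 36, 559, 806. In the example above, the width and height would presumably come from pagesize.getWidth() and pagesize.getHeight(), and the four values would be passed to new Rectangle(llx, lly, urx, ury).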
See the following questions and answers of the official FAQ:
How to add a table on a form (and maybe insert a new page)?
How to continue field output on a second page?

iText Android - Adding text to existing PDF

We have a PDF with some fields for collecting data, and I have to fill it programmatically with iText on Android by adding some text at those positions. I've been thinking about different ways to achieve this, with little success in each one.
Note: I'm using the Android version of iText (iTextG 5.5.4) and a Samsung Galaxy Note 10.1 2014 (Android 4.4) for most of my tests.
The approach I took from the start was to "draw" the text at given coordinates on a given page. This has some problems with managing the fields (I have to be aware of the length of the strings, and it can be hard to position each text at the exact coordinates in the PDF). But most importantly, the process is really slow on some devices and OS versions (it works great on a Nexus 5 with 5.0.2, but takes several minutes with a 5 MB PDF on the Note 10.1).
pdfReader = new PdfReader(is);
document = new Document();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
pdfCopy = new PdfCopy(document, baos);
document.open();
PdfImportedPage page;
PdfCopy.PageStamp stamp;
for (int i = 1; i <= pdfReader.getNumberOfPages(); i++) {
page = pdfCopy.getImportedPage(pdfReader, i); // First page = 1
stamp = pdfCopy.createPageStamp(page);
// Use a separate counter so the inner loop does not redeclare the page index i
for (int j = 0; j < 10; j++) {
int posX = j * 50;
int posY = j * 100;
Phrase phrase = new Phrase("Example text", FontFactory.getFont(FontFactory.HELVETICA, 12, BaseColor.RED));
ColumnText.showTextAligned(stamp.getOverContent(), Element.ALIGN_CENTER, phrase, posX, posY, 0);
}
stamp.alterContents();
pdfCopy.addPage(page);
}
We thought about adding form fields instead of drawing. That way I can configure a TextField and avoid managing the texts myself. However, the final PDF shouldn't have any annotations, so I would need to copy it into a new PDF without annotations and with those form fields drawn. I don't have an example of this because I wasn't able to get it working, and I don't even know if it is possible or worthwhile.
The third option would be to receive a Pdf with the "forms fields" already added, that way I only have to fill them. However I still need to create a new Pdf with all those fields and without annotations...
I'd like to know what would be the best-performing way to do this, and would welcome any help achieving it. I am a real newbie with iText and any help would be really appreciated.
Thanks!
EDIT
In the end I used the third option: a PDF with editable fields that we fill, and then we use "flattening" to create a non-editable PDF with all the texts already in place.
The code is as follows:
pdfReader = new PdfReader(is);
FileOutputStream fios = new FileOutputStream(outPdf);
PdfStamper pdfStamper = new PdfStamper(pdfReader, fios);
//Filling the PDF (It's totally necessary that the PDF has Form fields)
fillPDF(pdfStamper);
//Setting the PDF to uneditable format
pdfStamper.setFormFlattening(true);
pdfStamper.close();
and the method to fill the forms:
public static void fillPDF(PdfStamper stamper) throws IOException, DocumentException {
    // Getting the form fields from the PDF
    AcroFields form = stamper.getAcroFields();
    // Set each field once by name; the names must match the fields defined in the PDF
    form.setField("name", "Ernesto");
    form.setField("surname", "Lage");
}
The only thing about this approach is that you need to know the name of each field in order to fill it.
There is a process in iText known as 'flattening', which takes the form fields, and replaces them with the text that the fields contain.
I haven't used iText in a few years (and not at all on Android), but if you search the manual or online examples for 'flattening', you should find how to do it.

Reading a table or cell value in a pdf file using java?

I have gone through Java and PDF forums looking for a way to extract a text value from a table in a PDF file, but couldn't find any solution except JPedal (which is not open source and requires a license).
So, I would like to know whether any open-source APIs, such as PDFBox or iText, can achieve the same result as JPedal.
Ref. Example:
In comments the OP clarified that he locates the text value from the table in a pdf file he wants to extract
By providing X and Y co-ordinates
Thus, while the question initially sounded like generic extraction of tabular data from PDFs (which can be difficult, to say the least), it is actually about extracting the text from a rectangular region on a page given by coordinates.
This is possible using either of the libraries you mentioned (and surely others, too).
iText
To restrict the region from which you want to extract text, you can use the RegionTextRenderFilter in a FilteredTextRenderListener, e.g.:
/**
* Parses a specific area of a PDF to a plain text file.
* @param pdf the original PDF
* @param txt the resulting text
* @throws IOException
*/
public void parsePdf(String pdf, String txt) throws IOException {
PdfReader reader = new PdfReader(pdf);
PrintWriter out = new PrintWriter(new FileOutputStream(txt));
Rectangle rect = new Rectangle(70, 80, 490, 580);
RenderFilter filter = new RegionTextRenderFilter(rect);
TextExtractionStrategy strategy;
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
out.println(PdfTextExtractor.getTextFromPage(reader, i, strategy));
}
out.flush();
out.close();
reader.close();
}
(ExtractPageContentArea sample from iText in Action, 2nd edition)
Beware, though, iText extracts text based on the basic text chunks in the content stream, not based on each individual glyph in such a chunk. Thus, the whole chunk is processed if only the tiniest part of it is in the area.
This may or may not suit you.
If you run into the problem that more is extracted than you wanted, you should split the chunks into their constituting glyphs beforehand. This stackoverflow answer explains how to do that.
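The difference between chunk-level and glyph-level filtering can be illustrated with a toy model in plain Java. This is not how iText is implemented: the class, the box representation ({llx, lly, urx, ury}), the fixed 10-point glyph size, and the sample region are all assumptions chosen to make the effect visible.

```java
// Toy model of region filtering; RegionFilterDemo is hypothetical and not part of iText.
final class RegionFilterDemo {

    /** Boxes are {llx, lly, urx, ury}. Returns true if the two boxes overlap. */
    static boolean intersects(float[] a, float[] b) {
        return a[0] < b[2] && b[0] < a[2] && a[1] < b[3] && b[1] < a[3];
    }

    /** Chunk-level filtering: the chunk is kept whole if any part of it touches the region. */
    static String filterByChunk(String text, float[][] glyphBoxes, float[] region) {
        if (glyphBoxes.length == 0) {
            return "";
        }
        // Bounding box of a single left-to-right line: from the first glyph to the last.
        float[] first = glyphBoxes[0];
        float[] last = glyphBoxes[glyphBoxes.length - 1];
        float[] chunkBox = { first[0], first[1], last[2], last[3] };
        return intersects(chunkBox, region) ? text : "";
    }

    /** Glyph-level filtering: only glyphs that touch the region are kept. */
    static String filterByGlyph(String text, float[][] glyphBoxes, float[] region) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (intersects(glyphBoxes[i], region)) {
                kept.append(text.charAt(i));
            }
        }
        return kept.toString();
    }

    /** Lays out each character as a square box of the given size on one line. */
    static float[][] layout(String text, float startX, float y, float size) {
        float[][] boxes = new float[text.length()][];
        for (int i = 0; i < text.length(); i++) {
            float x = startX + i * size;
            boxes[i] = new float[] { x, y, x + size, y + size };
        }
        return boxes;
    }

    public static void main(String[] args) {
        String text = "ABCDEF";
        float[][] boxes = layout(text, 0f, 0f, 10f);
        float[] region = { 35f, -5f, 65f, 15f };  // covers only the right half of the chunk
        System.out.println("chunk-level: " + filterByChunk(text, boxes, region));
        System.out.println("glyph-level: " + filterByGlyph(text, boxes, region));
    }
}
```

In this model the chunk-level filter returns the whole string ABCDEF because the chunk touches the region, while the glyph-level filter returns only DEF. That is the kind of over-extraction that splitting chunks into glyphs is meant to avoid.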
PDFBox
To restrict the region from which you want to extract text, you can use the PDFTextStripperByArea, e.g.:
PDDocument document = PDDocument.load( args[0] );
if( document.isEncrypted() )
{
document.decrypt( "" );
}
PDFTextStripperByArea stripper = new PDFTextStripperByArea();
stripper.setSortByPosition( true );
Rectangle rect = new Rectangle( 10, 280, 275, 60 );
stripper.addRegion( "class1", rect );
List allPages = document.getDocumentCatalog().getAllPages();
PDPage firstPage = (PDPage)allPages.get( 0 );
stripper.extractRegions( firstPage );
System.out.println( "Text in the area:" + rect );
System.out.println( stripper.getTextForRegion( "class1" ) );
(ExtractTextByArea from the PDFBox 1.8.8 examples)
Try PDFTextStream. With it, I am at least able to identify the column values. Earlier, I was using iText and got stuck defining a strategy. It's hard.
This API separates column cells by inserting extra spaces. The spacing is fixed, so you can build logic on top of it (this was missing in iText).
import com.snowtide.PDF;
import com.snowtide.pdf.Document;
import com.snowtide.pdf.OutputTarget;
public class PDFText {
public static void main(String[] args) throws java.io.IOException {
String pdfFilePath = "xyz.pdf";
Document pdf = PDF.open(pdfFilePath);
StringBuilder text = new StringBuilder(1024);
pdf.pipe(new OutputTarget(text));
pdf.close();
System.out.println(text);
}
}
Question has been asked related to this on stackoverflow!
