iText pdf Multiple Pages with same Content - java

How can I generate a PDF report with multiple pages that have the same content on each page? The following is the code for a single-page report. The multiple pages should be in a single PDF file.
<%
response.setContentType( "application/pdf" );
response.setHeader ("Content-Disposition","attachment;filename=TEST1.pdf");
Document document=new Document(PageSize.A4,25,25,35,0);
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
PdfWriter writer=PdfWriter.getInstance( document, buffer);
document.open();
Font fontnormalbold = FontFactory.getFont("Arial", 10, Font.BOLD);
Paragraph p1=new Paragraph("",fontnormalbold);
float[] iwidth = {1f,1f,1f,1f,1f,1f,1f,1f};
float[] iwidth1 = {1f};
PdfPTable table1 = new PdfPTable(iwidth);
table1.setWidthPercentage(100);
PdfPCell cell = new PdfPCell(new Paragraph("Testing Page",fontnormalbold));
cell.setHorizontalAlignment(1);
cell.setColspan(8);
cell.setPadding(5.0f);
table1.addCell(cell);
PdfPTable outerTable = new PdfPTable(iwidth1);
outerTable.setWidthPercentage(100);
PdfPCell containerCell = new PdfPCell();
containerCell.addElement(table1);
outerTable.addCell(containerCell);
p1.add(outerTable);
document.add(new Paragraph(p1));
document.close();
DataOutput output = new DataOutputStream( response.getOutputStream() );
byte[] bytes = buffer.toByteArray();
response.setContentLength(bytes.length);
for( int i = 0; i < bytes.length; i++ ) { output.writeByte( bytes[i] ); }
response.getOutputStream().flush();
response.getOutputStream().close();
%>

There are different ways to solve this problem. Not all of the solutions are elegant.
Approach 1: add the same table many times.
I see that you are creating a PdfPTable object named outerTable. I'm going to ignore the silly things you do with this table (e.g. why are you adding this table to a Paragraph? Why are you adding a single cell with colspan 8 to a table with 8 columns? Why are you nesting this table into a table with a single column? All of these shenanigans are really weird), but having that outerTable, you could do this:
for (int i = 0; i < x; i++) {
document.add(outerTable);
document.newPage();
}
This will add the table x times and it will start a new page for every table. This is also what the people in the comments advised you, and although the code looks really elegant, it doesn't result in an elegant PDF. That is: if you were my employee, I'd fire you if you did this.
Why? Because adding a table requires CPU and you are using x times the CPU you need. Moreover, with every table you create, you create new content streams. The same content will be added x times to your document. Your PDF will be about x times bigger than it should be.
Why would this be a reason to fire a developer? Because applications like this usually live in the cloud. In the cloud, one usually pays for CPU and bandwidth. A developer who writes code that requires a multiple of the necessary CPU and bandwidth causes a cost that is unacceptable. In many cases, it is more cost-efficient to fire bad developers, hire slightly more expensive developers and buy slightly more expensive software, and then save plenty of money in the long term thanks to code that is more efficient in terms of CPU and bandwidth.
Approach 2: add the table to a PdfTemplate, reuse the PdfTemplate.
Please take a look at my answer to the StackOverflow question How to resize a PdfPTable to fit the page?
In this example, I create a PdfPTable named table. I know how wide I want the table to be (PageSize.A4.getWidth()), but I don't know in advance how high it will be. So I lock the width, I add the cells I need to add, and then I can calculate the height of the table like this: table.getTotalHeight().
I create a PdfTemplate that is exactly as big as the table:
PdfContentByte canvas = writer.getDirectContent();
PdfTemplate template = canvas.createTemplate(
table.getTotalWidth(), table.getTotalHeight());
I now add the table to this template:
table.writeSelectedRows(0, -1, 0, table.getTotalHeight(), template);
I wrap the table inside an Image object. This doesn't mean we're rasterizing the table: all text and lines are preserved as vector data.
Image img = Image.getInstance(template);
I scale the img so that it fits the page size I have in mind:
img.scaleToFit(PageSize.A4.getWidth(), PageSize.A4.getHeight());
Now I position the table vertically in the middle. I use the scaled height of the image here, because scaleToFit() may have shrunk it:
img.setAbsolutePosition(
0, (PageSize.A4.getHeight() - img.getScaledHeight()) / 2);
If you want to add the table multiple times, this is how you'd do it:
for (int i = 0; i < x; i++) {
document.add(img);
document.newPage();
}
What is the difference with Approach 1? Well, by using PdfTemplate, you are creating a Form XObject. A Form XObject is a content stream that is external to the page stream. A Form XObject is stored in the PDF file only once, and it can be reused many times, e.g. on every page of a document.
Approach 3: create a PDF document with a single page; concatenate the file many times
You are creating your PDF in memory. The PDF is stored in the buffer object. You could read this PDF using PdfReader like this:
PdfReader reader = new PdfReader(buffer.toByteArray());
Then you reuse this content like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document doc = new Document();
PdfSmartCopy copy = new PdfSmartCopy(doc, baos);
doc.open();
for (int i = 0; i < x; i++) {
copy.addDocument(reader);
}
doc.close();
reader.close();
Now you can send the bytes stored in baos to the OutputStream of your response object. Make sure that you use PdfSmartCopy instead of PdfCopy. PdfCopy just copies the pages AS-IS without checking if there is redundant information. The result is a bloated PDF similar to the one you'd get if you'd use Approach 1. PdfSmartCopy looks at the bytes of the content streams and will detect that you're adding the same page over and over again. That page will be reused the same way as is done in Approach 2.
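To make the difference between PdfCopy and PdfSmartCopy a bit more tangible, here is a minimal sketch of the general idea of storing identical content only once. It is plain Java, not iText code, and it does not claim to mirror how PdfSmartCopy works internally; the class StreamDeduplicator and its methods are hypothetical names made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, NOT part of iText: identical byte sequences are
// stored once and every later occurrence gets the index of the stored copy.
final class StreamDeduplicator {
    // ByteBuffer compares by content, so equal byte arrays map to the same key.
    private final Map<ByteBuffer, Integer> indexByContent = new HashMap<>();

    /** Returns the index of the stored copy; stores the bytes only if unseen. */
    int add(byte[] stream) {
        ByteBuffer key = ByteBuffer.wrap(stream.clone());
        Integer existing = indexByContent.get(key);
        if (existing != null) {
            return existing; // same bytes seen before: reuse, store nothing new
        }
        int index = indexByContent.size();
        indexByContent.put(key, index);
        return index;
    }

    /** Number of distinct streams actually stored. */
    int storedCount() {
        return indexByContent.size();
    }
}
```

Adding the same page content x times to such a store keeps a single copy and hands back the same index every time, which is the effect you want from Approach 2 and Approach 3. Adding it x times without any such check corresponds to the bloat described for Approach 1.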

Related

Replace itext to pdfbox performance

I am evaluating whether to replace iText with PDFBox for our PDF processing. I ran some tests with 200 single-page PDFs (94KB, 469KB, 937KB each) and merged them into one PDF in our application. PDFBox version: 2.0.23.
itext version: 2.1.7. Here are the test results:
Here is the itext implementation:
byte[] l_PDFPage = null;
PdfReader l_PDFReader = null;
PdfCopy l_Copier = null;
Document l_PDFDocument = null;
OutputStream l_Stream = new FileOutputStream(m_File);
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
l_Page = m_Editor.getPageAt(i);
l_PDFPage = l_Page.getAsPdf();
l_PDFReader = new PdfReader(l_PDFPage);
l_PDFReader.getPageN(1).put(PdfName.ROTATE, new PdfNumber(l_PDFReader.getPageRotation(1) + l_Page.getRotation() % 360));
l_PDFReader.consolidateNamedDestinations();
if( i == 0 ) {
l_PDFDocument = new Document(l_PDFReader.getPageSizeWithRotation(1));
l_Copier = new PdfCopy(l_PDFDocument, l_Stream);
l_PDFDocument.open();
}
l_Copier.addPage(l_Copier.getImportedPage(l_PDFReader, 1));
if( l_PDFReader.getAcroForm() != null )
l_Copier.copyAcroForm(l_PDFReader);
l_Copier.flush();
l_Copier.freeReader(l_PDFReader);
}
l_PDFDocument.close();
l_Stream.close();
Here is the pdfbox implementation:
byte[] l_PDFPage = null;
List<PDDocument> pageDocuments = new ArrayList<>();
PDDocument saveDocument = new PDDocument();
try {
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
// our wrapper object for a page
l_Page = m_Editor.getPageAt(i);
// page as byte[]
l_PDFPage = l_Page.getAsPdf();
PDDocument document = PDDocument.load(l_PDFPage);
// save page document to close it later
pageDocuments.add(document);
PDPage page = document.getPage(0);
saveDocument.addPage(saveDocument.importPage(page));
}
saveDocument.save(l_Stream);
}
finally {
// close every page document
for(PDDocument doc : pageDocuments) {
doc.close();
}
saveDocument.close();
}
I have also tried using the PDFMergerUtility of PDFBox. The performance was nearly the same as with the other PDFBox implementation, but with the 937KB files I run into an OutOfMemoryError with this implementation:
byte[] l_PDFPage = null;
OutputStream l_Stream = new FileOutputStream(m_File);
PDFMergerUtility merger = new PDFMergerUtility();
// do it for all pages in the editor
for( int i = 0; i < m_Editor.getCountOfElements(); i++ ) {
l_Page = m_Editor.getPageAt(i);
// page as byte[]
l_PDFPage = l_Page.getAsPdf();
merger.addSource(new ByteArrayInputStream(l_PDFPage));
}
merger.setDestinationStream(l_Stream);
merger.mergeDocuments(null);
So my questions:
Why is the performance (needed time AND memory usage) of pdfbox so bad in comparison to itext?
Am I missing something in our pdfbox implementation?
Why can't I close the "page document" after I have added the page to "saveDocument"? If I close it there, I get an error while saving, so I have to store the "page documents" and close them at the end.
PDFBox and iText are architecturally different and, therefore, perform differently well for different tasks.
In particular iText attempts to write out new contents early, in your case much of the page is written to the output already during
l_Copier.addPage(l_Copier.getImportedPage(l_PDFReader, 1));
and
l_PDFDocument.close();
eventually only finalizes the PDF and writes last remaining objects and the file trailer.
PDFBox on the other hand saves everything in the end at once:
saveDocument.save(l_Stream);
The approach of iText has the advantage of a smaller memory footprint (as you observed) and the disadvantage that you cannot change data of a page once it is written.
(As an aside: the iText architecture has changed from iText 5 to iText 7, in iText 7 you have the choice and can keep everything in memory, but the price here also is a big memory footprint.)
Thus,
Why is the performance (needed time AND memory usage) of pdfbox so bad in comparison to itext?
The difference in memory use can partially be explained by the above. Also in iText after
l_Copier.freeReader(l_PDFReader);
the PdfReader can be closed (which you leave to the garbage collection to do for you) to free its resources while in your PDFBox code you keep all the source documents open, holding the resources up to the end. (Actually I would have assumed that when you're using importPage, you needn't keep them.)
Concerning the time, I'm not sure right now. You should do some finer clocking and determine where exactly the extra time is used in PDFBox; thus, I second @Tilman's request for profiling data. I assume it's during the final save, but that's only a hunch. Also, such time differences might depend on structural details of the PDFs in question and may be less extreme for other documents.
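As a rough idea of what such finer clocking could look like, here is a minimal sketch in plain Java. The class PhaseTimer and the phase names are hypothetical, made up for this illustration; a real profiler will give you more reliable numbers, so treat this only as a first approximation.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: accumulates wall-clock time per named phase.
final class PhaseTimer {
    /** Like Runnable, but allowed to throw a checked exception such as IOException. */
    interface ThrowingRunnable<E extends Exception> {
        void run() throws E;
    }

    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the work and adds its elapsed time to the given phase, even if it throws. */
    <E extends Exception> void time(String phase, ThrowingRunnable<E> work) throws E {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a phase, or 0 if it was never timed. */
    long nanos(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L);
    }

    /** One line per phase, in the order the phases were first timed. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanosByPhase.entrySet()) {
            sb.append(String.format(Locale.ROOT, "%s: %.3f ms%n",
                    e.getKey(), e.getValue() / 1_000_000.0));
        }
        return sb.toString();
    }
}
```

You would wrap the loading of each page document, the importPage call and the final save in separate time(...) calls, using phase names such as "load", "import" and "save", and then compare the totals in the report. That should show whether the final save really is where the time goes.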

how to place text and images in small sized page using iText

I need to make a PDF page that looks something like this:
I'm having trouble making two columns that fit on a small-sized page.
This is my code:
public void createSizedPdf(String dest) throws IOException, DocumentException {
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(dest));
document.setMargins(5,5,5,5);
Rectangle one = new Rectangle(290,100);
one.setBackgroundColor(Color.YELLOW);
document.setPageSize(one);
document.open();
Paragraph consigneeName = new Paragraph("Ahmed");
Paragraph address = new Paragraph("Casa ST 121");
String codeBL = "14785236987541";
PdfContentByte cb = writer.getDirectContent();
Barcode128 code128 = new Barcode128();
code128.setBaseline(9);
code128.setSize(9);
code128.setCode(codeBL);
code128.setCodeType(Barcode128.CODE128);
Image code128Image = code128.createImageWithBarcode(cb, null, null);
Paragraph right = new Paragraph();
right.add(consigneeName);
right.add(address);
right.add(code128Image);
Chunk glue = new Chunk(new VerticalPositionMark());
Paragraph p = new Paragraph();
p.add(right);
p.add(new Chunk(glue));
p.add(code128Image);
document.add(p);
document.close();
}
One way to solve your problem would be to create a template PDF with AcroForm fields. You could create a nice design manually, and then fill out the form programmatically by putting data (text, bar codes) at the appropriate places defined by the fields that act as placeholders.
Another way is to create the PDF from scratch, which is the approach you seem to have taken, looking at your code.
Your question isn't entirely clear, in the sense that you share your code, but you don't explain the problem you are experiencing. As I already commented:
are you unable to scale images? are you unable to define a smaller
font size? are you unable to create a table with specific dimension?
You say I'm having problems to make two columns that fits in a small
sized page but you forgot to describe the problems.
You didn't give an answer to those questions, and that makes it very hard for someone to answer your question. The only thing a Stack Overflow reader could do, is to do your work in your place. That's not what Stack Overflow is for.
Moreover, the answer to your question is so trivial that it is hard for a Stack Overflow reader to understand why you posted a question.
When you say you need to add data (text and bar codes) in two columns, you are actually saying that you want to create a table. This is an example of such a table:
If you look at the SmallTable example, you can see how it's built.
You want a PDF that measures 290 by 100 user units, with a margin of 5 user units on each side. This means that you have space for a table measuring 280 by 90 user units. Looking at your screen shot, I'd say that you have one column that is 160 user units wide and one column that is 120 user units wide. I'd also say that you have three rows, each 30 user units high.
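The arithmetic behind those numbers can be checked with a few lines of plain Java. This is only a sketch: the class TableFit and its method names are hypothetical and have nothing to do with the iText API.

```java
// Hypothetical helper: checks whether a grid of columns and rows fits on a
// page after subtracting the same margin on all four sides.
final class TableFit {
    /** True if the columns and rows fit inside the page minus its margins. */
    static boolean fits(float pageWidth, float pageHeight, float margin,
                        float[] columnWidths, float[] rowHeights) {
        float usableWidth = pageWidth - 2 * margin;   // 290 - 10 = 280 here
        float usableHeight = pageHeight - 2 * margin; // 100 - 10 = 90 here
        return sum(columnWidths) <= usableWidth && sum(rowHeights) <= usableHeight;
    }

    /** Adds up all values in the array. */
    static float sum(float[] values) {
        float total = 0f;
        for (float v : values) {
            total += v;
        }
        return total;
    }
}
```

For the page described here, fits(290, 100, 5, new float[]{160, 120}, new float[]{30, 30, 30}) is true: the columns add up to exactly 280 and the rows to exactly 90. Make one column 10 user units wider and the table no longer fits.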
OK, then why don't you create a table based on those dimensions?
public void createPdf(String dest) throws IOException, DocumentException {
Rectangle small = new Rectangle(290,100);
Font smallfont = new Font(FontFamily.HELVETICA, 10);
Document document = new Document(small, 5, 5, 5, 5);
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(dest));
document.open();
PdfPTable table = new PdfPTable(2);
table.setTotalWidth(new float[]{ 160, 120 });
table.setLockedWidth(true);
PdfContentByte cb = writer.getDirectContent();
// first row
PdfPCell cell = new PdfPCell(new Phrase("Some text here"));
cell.setFixedHeight(30);
cell.setBorder(Rectangle.NO_BORDER);
cell.setColspan(2);
table.addCell(cell);
// second row
cell = new PdfPCell(new Phrase("Some more text", smallfont));
cell.setFixedHeight(30);
cell.setVerticalAlignment(Element.ALIGN_MIDDLE);
cell.setBorder(Rectangle.NO_BORDER);
table.addCell(cell);
Barcode128 code128 = new Barcode128();
code128.setCode("14785236987541");
code128.setCodeType(Barcode128.CODE128);
Image code128Image = code128.createImageWithBarcode(cb, null, null);
cell = new PdfPCell(code128Image, true);
cell.setBorder(Rectangle.NO_BORDER);
cell.setFixedHeight(30);
table.addCell(cell);
// third row
table.addCell(cell);
cell = new PdfPCell(new Phrase("and something else here", smallfont));
cell.setBorder(Rectangle.NO_BORDER);
cell.setHorizontalAlignment(Element.ALIGN_RIGHT);
table.addCell(cell);
document.add(table);
document.close();
}
In this example,
you learn how to change the font of the content in a cell,
you learn how to change horizontal and vertical alignment,
you learn how to scale a bar code so that it fits into a cell,
...
All of this functionality is explained in the official documentation. As I said before: you didn't explain the nature of your problem. What wasn't clear in the documentation for you? What is your question?

Can't remove whiteSpace in Java itext

Here I am combining 2 PDF documents using the iText packages.
Merging works successfully with the code below:
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
for (InputStream in : list)
{
PdfReader reader = new PdfReader(in);
for (int i = 1; i <= reader.getNumberOfPages(); i++)
{
document.newPage();
//import the page from source pdf
PdfImportedPage page = writer.getImportedPage(reader, i);
//add the page to the destination pdf
cb.addTemplate(page, 0, 0);
}
}
outputStream.flush();
document.close();
outputStream.close();
Here, list is a List of InputStreams and outputStream is an OutputStream.
The problem I am having is that I want the PDF documents in the list to be appended directly after the content of the first PDF
(i.e. if the first PDF has 4 lines, I want the second PDF to continue on the same page after the 4th line).
What I get instead is that the second PDF starts on the second page.
Is there an alternative to document.newPage()?
Can anyone help me with this?
Thanks, I would like to hear any responses :)
It depends on the requirements you have. As long as
you only are interested in the page contents of the merged PDFs, not in the page annotations and
the pages have no content but the text lines you mention, in particular no background graphics, watermarks, or header/footer lines,
you can use either the
PdfDenseMergeTool from this answer or the
PdfVeryDenseMergeTool from this answer.
If you are interested in annotations, it should be no problem to extend those classes accordingly. If your PDFs have background graphics or watermarks, headers or footers, they should be removed beforehand.

iText Android - Adding text to existing PDF

we have a PDF with some fields in order to collect some data, and I have to fill it programmatically with iText on Android by adding some text in those positions. I've been thinking about different ways to achieve this, with little success in each one.
Note: I'm using the Android version of iText (iTextG 5.5.4) and a Samsung Galaxy Note 10.1 2014 (Android 4.4) for most of my tests.
The approach I took from the start was to "draw" the text at given coordinates on a given page. This has some problems with the management of the fields (I have to be aware of the length of the strings, and it can be hard to position each text at the exact coordinates in the PDF). But most importantly, the process is really slow on some devices/OS versions (it works great on a Nexus 5 with 5.0.2, but takes several minutes with a 5MB PDF on the Note 10.1).
pdfReader = new PdfReader(is);
document = new Document();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
pdfCopy = new PdfCopy(document, baos);
document.open();
PdfImportedPage page;
PdfCopy.PageStamp stamp;
for (int i = 1; i <= pdfReader.getNumberOfPages(); i++) {
page = pdfCopy.getImportedPage(pdfReader, i); // First page = 1
stamp = pdfCopy.createPageStamp(page);
for (int j=0; j<10; j++) { // j, because i is already used by the page loop
int posX = j*50;
int posY = j*100;
Phrase phrase = new Phrase("Example text", FontFactory.getFont(FontFactory.HELVETICA, 12, BaseColor.RED));
ColumnText.showTextAligned(stamp.getOverContent(), Element.ALIGN_CENTER, phrase, posX, posY, 0);
}
stamp.alterContents();
pdfCopy.addPage(page);
}
We thought about adding "form fields" instead of drawing. That way I can configure a TextField and avoid managing the texts myself. However, the final PDF shouldn't have any annotations, so I would need to copy it into a new PDF without annotations and with those "form fields" drawn. I don't have an example of this because I wasn't able to get it working; I don't even know if this is possible/worthwhile.
The third option would be to receive a Pdf with the "forms fields" already added, that way I only have to fill them. However I still need to create a new Pdf with all those fields and without annotations...
I'd like to know what would be the best way, performance-wise, to do this, and I'd welcome any help in achieving it. I am a complete newbie with iText, so any help would be really appreciated.
Thanks!
EDIT
In the end I used the third option: a PDF with editable fields that we fill in, and then we use "flattening" to create a non-editable PDF with all texts already there.
The code is as follows:
pdfReader = new PdfReader(is);
FileOutputStream fios = new FileOutputStream(outPdf);
PdfStamper pdfStamper = new PdfStamper(pdfReader, fios);
//Filling the PDF (It's totally necessary that the PDF has Form fields)
fillPDF(pdfStamper);
//Setting the PDF to uneditable format
pdfStamper.setFormFlattening(true);
pdfStamper.close();
and the method to fill the forms:
public static void fillPDF(PdfStamper stamper) throws IOException, DocumentException{
//Getting the Form fields from the PDF
AcroFields form = stamper.getAcroFields();
//Each field is set once, by name; looping over all field names is not needed for this
form.setField("name", "Ernesto");
form.setField("surname", "Lage");
}
The only thing about this approach is that you need to know the name of each field in order to fill it.
There is a process in iText known as 'flattening', which takes the form fields, and replaces them with the text that the fields contain.
I haven't used iText in a few years (and not at all on Android), but if you search the manual or online examples for 'flattening', you should find how to do it.

iText Pdf Page Byte Size

I have a business requirement to split PDFs into multiple documents.
Let's say I have a 100MB PDF; for simplicity's sake, I need to split it into multiple PDFs no larger than 10MB apiece.
I am using iText.
I am going to take the original PDF and loop through its pages, but how can I determine the file size of each page without writing it separately to disk?
Sample code for simplicity
int numPages = reader.getNumberOfPages();
PdfImportedPage page;
for (int currentPage = 0; currentPage < numPages; ){
++currentPage;
//Get page from reader
page = writer.getImportedPage(reader, currentPage);
// I need the size in bytes here of the page
}
I think the easiest way is to write it to the disk and delete it afterwards:
Document document = new Document();
File f = new File("C:\\delete.pdf"); //for instance
//PdfCopy copies an existing page as-is; a PdfImportedPage cannot be passed to document.add()
PdfCopy copy = new PdfCopy(document, new FileOutputStream(f));
document.open();
copy.addPage(copy.getImportedPage(reader, currentPage));
document.close();
long filesize = f.length(); //this is the file size in bytes
f.delete();
I'm not absolutely sure, I admit, but I don't see how it would be possible to determine the file size if the file doesn't exist.
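One possible alternative to a temporary file, sketched here under assumptions, is to write each single-page PDF to an OutputStream that only counts bytes, and then to group consecutive pages under the size limit. This is plain Java; the class PageSizeBudget and its members are hypothetical names. With iText you would pass the counting stream to the writer in place of the FileOutputStream. Keep in mind that pages can share resources such as fonts and images, so per-page sizes measured this way are only an estimate of what a combined file will weigh.

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: measures output size in memory and groups pages by a byte budget.
final class PageSizeBudget {
    /** An OutputStream that discards the data and only counts the bytes written. */
    static final class CountingOutputStream extends OutputStream {
        private long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }

        long size() {
            return count;
        }
    }

    /**
     * Greedily groups consecutive pages so that each group stays within maxBytes.
     * A single page larger than maxBytes still gets a group of its own.
     * Page numbers in the result are 1-based.
     */
    static List<List<Integer>> groupPages(long[] pageSizes, long maxBytes) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < pageSizes.length; i++) {
            if (!current.isEmpty() && currentBytes + pageSizes[i] > maxBytes) {
                groups.add(current); // budget exceeded: close the current group
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i + 1);
            currentBytes += pageSizes[i];
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

Because the grouping only uses estimates, it is probably wise to keep some headroom below the real limit and to check the size of each resulting file after writing it.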
