iText PDF shrink issue: the hyperlink bounding box is not shrunk - java

I am shrinking a PDF and then watermarking it, using itextpdf-5.5.1.jar. Here is the code I use to shrink the PDF. In the code, xPercentage and yPercentage are both 0.9f. When I shrink a PDF that has a table of contents, the content on the page shrinks properly. But in the table of contents, the bounding boxes of the hyperlinks are misplaced. I noticed that the size of each bounding box is the same in the original and in the shrunk output document. How do I shrink the bounding boxes of the hyperlinks along with the content?
public void shrinkPDF(String strFilePath, String strFileName) throws Exception {
    PdfReader reader = new PdfReader(strFilePath + "//" + strFileName);
    PdfStamper stamper = new PdfStamper(reader,
            new FileOutputStream(strFilePath + "//Shrink_" + strFileName));
    int n = reader.getNumberOfPages();
    Map<String, PdfLayer> mpPDFLayer = stamper.getPdfLayers(); // not used below
    for (int p = 1; p <= n; p++) {
        // offsets that keep the shrunk content centered on the page
        float offsetX = (reader.getPageSize(p).getWidth() * (1 - xPercentage)) / 2;
        float offsetY = (reader.getPageSize(p).getHeight() * (1 - yPercentage)) / 2;
        // put a scaling matrix in front of the existing page content ...
        stamper.getUnderContent(p).setLiteral(
                String.format("\nq %s 0 0 %s %s %s cm\nq\n",
                        xPercentage, yPercentage, offsetX, offsetY));
        // ... and restore the two saved graphics states after it
        stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
    }
    stamper.close();
    reader.close();
}

Your code shrinks only the content but it does not accordingly move and shrink annotations. So what you have to do additionally is to iterate over the annotations of each page and shrink them.
This in particular means that you have to shrink and move the Rect annotation rectangle. Depending on the nature of the respective annotation, though, there also are other coordinate values in them, e.g. the QuadPoints in case of a link or the L endpoint coordinates of a line.
BTW, your content shrinking code makes assumptions on the origin of the user space coordinate system; it appears to assume that the origin is in the lower left of the crop box and that the crop box and the media box coincide.
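A minimal sketch of the coordinate arithmetic involved, assuming the same matrix as in the question (scale factors xPercentage and yPercentage, translation offsetX and offsetY). The class and method names are hypothetical, and the example works on plain float arrays so that it does not depend on any iText call; in practice the numbers would be read from the Rect, QuadPoints or L entry of each annotation and written back after the transformation.

```java
final class AnnotationShrink {

    /**
     * Applies x' = sx * x + tx and y' = sy * y + ty to a flat list of
     * coordinates laid out as x0, y0, x1, y1, ...
     * That layout covers Rect (4 numbers), QuadPoints (8 numbers per quad)
     * and the L entry of a line annotation (4 numbers).
     */
    static float[] transform(float[] coords, float sx, float sy, float tx, float ty) {
        if (coords.length % 2 != 0) {
            throw new IllegalArgumentException("expected x/y pairs");
        }
        float[] out = new float[coords.length];
        for (int i = 0; i < coords.length; i += 2) {
            out[i] = sx * coords[i] + tx;         // x coordinate
            out[i + 1] = sy * coords[i + 1] + ty; // y coordinate
        }
        return out;
    }
}
```

For a link rectangle on a page shrunk as in the question, one would call transform(rect, xPercentage, yPercentage, offsetX, offsetY) with the offsets computed for that page. The same caveat about the origin of the coordinate system applies here as for the content.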

Related

itext7 Java Create PdfExplicitDestination for titles in existing pdf

I am using a PdfExplicitDestination that points to a page for each title, which I find by reading the existing PDF content of the page,
but I need the bookmark to focus on the specific text content when it is clicked.
for (int page = 1; page <= pdf.getNumberOfPages(); page++) {
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
String currentText = PdfTextExtractor.getTextFromPage(pdf.getPage(page), strategy);
if (currentText.contains("title")) {
k.addDestination(PdfExplicitDestination.createXYZ(pdf.getPage(page), pdf.getPage(page).getPageSize().getLeft(), pdf.getPage(page).getPageSize().getTop(), 0));
//System.out.println(currentText);
}
}
I need to find the position of the title in the pdf page to set "float top" value.
PdfExplicitDestination.createXYZ(pageNum, left, top, zoom)
Can anyone please help me get it from the existing content in the PDF?
This task can be approached in a number of ways. One way is to go over the page content in "stripes" (rectangles with a small height) and only consider the content of one such rectangle at a time. If you find the text in such a rectangle, then you know that the desired text lies somewhere between the upper and lower Y bounds given by the rectangle coordinates. You can, for example, create the destination so that it points to the topmost coordinate of that rectangle. It might be a bit above the desired text, but the difference will be small, depending on the rectangle height you select.
The following code snippet contains an example implementation of this idea. There are two parameters. windowHeight must be tall enough to fit the piece of content you are looking for, but the smaller it is, the better the accuracy of the result. step defines the vertical distance between the successive rectangles of height windowHeight that are tried on each page. The smaller step is, the better the accuracy, but bigger values improve performance. It is up to the specific use case to tune those trade-offs.
final float windowHeight = 30;
final float step = 10;
for (int page = 1; page <= pdf.getNumberOfPages(); page++) {
Rectangle pageSize = pdf.getPage(page).getPageSize();
for (float upperPoint = pageSize.getHeight(); upperPoint > 0; upperPoint -= step) {
IEventFilter filter = new TextRegionEventFilter(new Rectangle(0, upperPoint - windowHeight, pageSize.getWidth(), windowHeight));
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
FilteredTextEventListener listener = new FilteredTextEventListener(strategy, filter);
new PdfCanvasProcessor(listener).processPageContent(pdf.getPage(page));
if (strategy.getResultantText().contains("title")) {
float top = upperPoint; // This is the topmost point of the rectangle
break; // Break here not to capture same text twice
}
}
}
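The scanning loop on its own can be sketched without any iText types, which also makes it easy to return the found coordinate instead of discarding it. This is only an illustration: the StripeScan class and the windowContainsText predicate are hypothetical, and the predicate stands in for the filtered text extraction step shown above.

```java
import java.util.function.BiPredicate;

final class StripeScan {

    /**
     * Slides a window of height windowHeight from the top of the page
     * downwards in increments of step. Returns the upper Y coordinate of
     * the first window for which the predicate (lowerY, upperY) is true,
     * or NaN if no window matches.
     */
    static float findTop(float pageHeight, float windowHeight, float step,
                         BiPredicate<Float, Float> windowContainsText) {
        if (step <= 0 || windowHeight <= 0) {
            throw new IllegalArgumentException("step and windowHeight must be positive");
        }
        for (float upper = pageHeight; upper > 0; upper -= step) {
            float lower = upper - windowHeight;
            if (windowContainsText.test(lower, upper)) {
                return upper; // first match from the top: stop scanning this page
            }
        }
        return Float.NaN;
    }
}
```

The returned value could then be passed as the top argument of PdfExplicitDestination.createXYZ for that page.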

Adding text top-right of PDF using ColumnText works for Portrait, not for Landscape

I am working on a project where as part of statements I need to attach arbitrary PDF files. These PDF files need to be marked by a title and page numbering, in the top-right corner of the PDF file. This is a legal requirement as these attachments are referred to by their title and total number of pages from the statements.
I (naively) hacked together some code that appears to be working on PDF files with pages in the Portrait orientation (at least the PDF files I tested with). However when I use this code on pages in a Landscape orientation, the title and numbering isn't visible.
The code:
PdfContentByte canvas = pdfStamper.getOverContent( pageNr );
Phrase phrase = new Phrase( sb.toString( ), new Font( FontFamily.HELVETICA, 9f ) ); // sb holds title + page numbering
float width = ColumnText.getWidth( phrase );
ColumnText.showTextAligned ( // draw text top-right
canvas,
Element.ALIGN_LEFT,
phrase,
canvas.getPdfDocument( ).right( ) - width, //x
canvas.getPdfDocument( ).top( ) + 9, //y
0 //rotation
);
Examples (screenshots not preserved): a portrait page where it appears to work, and a landscape page where it doesn't work.
Questions:
Where did I go wrong?
Is it possible to write such a piece of code that does it right for all possible page orientations?
If so, how?
You are adding the content, but you are adding it in the wrong place. See the question "PageSize of PDF always the same between landscape and portrait with itextpdf".
Let's assume that you are working with an A4 page in portrait orientation. That page measures 595 by 842 user units: 595 is the width; 842 is the height.
Now let's switch to landscape. This can be done in two different ways:
define a width of 595 and a height of 842, and a rotation of 90 degrees.
define a width of 842 and a height of 595.
The way the landscape orientation is defined has an impact on the values returned by the right() and top() methods. I am pretty sure that you are adding the header to the landscape pages, but outside the visible area of the page.
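To make the difference between the two definitions concrete, here is a small sketch that computes the size of a page as it is displayed from its unrotated size and its rotation. The PageOrientation class and the displayedSize method are hypothetical helpers for illustration, not part of iText.

```java
final class PageOrientation {

    /**
     * Returns {width, height} of the page as displayed, given the size of
     * the unrotated page and its rotation in degrees (a multiple of 90).
     */
    static float[] displayedSize(float width, float height, int rotation) {
        int r = ((rotation % 360) + 360) % 360; // normalize, e.g. -90 becomes 270
        if (r % 90 != 0) {
            throw new IllegalArgumentException("rotation must be a multiple of 90");
        }
        boolean swapped = (r == 90 || r == 270);
        return swapped ? new float[]{height, width} : new float[]{width, height};
    }
}
```

Both landscape definitions give a displayed size of 842 by 595, but only the second one has 842 as the width of the unrotated page. Coordinates taken from the unrotated size can therefore end up outside the visible area of a rotated page.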
For those interested, I ended up doing it as follows. This works for both Portrait and Landscape orientations. This uses the PdfReader.getPageSizeWithRotation method to get the proper page size.
private String pageText(int pageNr, int pageTotal) {
return ""; // generate string to display top-right of PDF here
}
private void addDocumentObjects(int pageNr, PdfReader pdfReader, PdfStamper pdfStamper) {
final float pageMargin = 25f;
final float textSize = 9f;
final float lineMargin = 5f;
Phrase phrase = new Phrase (
pageText(pageNr, pdfReader.getNumberOfPages()),
new Font(FontFamily.HELVETICA, textSize)
);
final float phraseWidth = ColumnText.getWidth(phrase);
PdfContentByte canvas = pdfStamper.getOverContent(pageNr);
com.itextpdf.text.Rectangle pageRectangle = pdfReader.getPageSizeWithRotation(pageNr);
// draw white background rectangle before adding text + line
canvas.setColorFill(BaseColor.WHITE);
canvas.rectangle (
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin), //y
phraseWidth, // width
textSize + lineMargin //height
);
canvas.fill();
// draw text top right
canvas.setColorFill(BaseColor.BLACK);
ColumnText.showTextAligned (
canvas, //canvas
Element.ALIGN_LEFT, //alignment
phrase, //phrase
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin), //y
0 //rotation
);
// draw line under text
canvas.setColorStroke(BaseColor.BLACK);
canvas.setLineWidth(1);
canvas.moveTo (
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin) - lineMargin //y
);
canvas.lineTo (
pageRectangle.getRight(pageMargin), //x
pageRectangle.getTop(pageMargin) - lineMargin //y
);
canvas.stroke();
}

How to create Table using Apache PDFBox

We are planning to migrate our PDF generation utilities from iText to PDFBox (due to licensing issues with iText). With some effort, I was able to write and position text, draw lines, etc. But creating tables with text embedded in the table cells is a challenge. I went through the documentation, examples, Google and Stack Overflow and couldn't find a thing. I was wondering whether PDFBox provides native support for creating tables with embedded text. My last resort would be to use this link: https://github.com/eduardohl/Paginated-PDFBox-Table-Sample
Since I also needed table drawing functionality for a side project, I implemented a small "table drawer" library myself, which I uploaded to github.
To produce a table like the example one (image not preserved), you would need the example code from that repository; the same file also contains the code for a second example table (image not preserved).
The current "feature list" includes:
set font and font size on table level as well as on cell level
define single cells with bottom-, top-, left- and right-border width separately
define the background color on row or cell level
define padding (top, bottom, left, right) on cell level
define border color (on table, row or cell level)
specify text alignment (vertical and horizontal)
cell spanning and row spanning
text wrapping and line spacing
Also it should not be too hard to add missing stuff like having different border colors for borders on top, bottom, left and right-borders, if needed.
Thanks to the links provided by Tilman. Using the boxable API (https://github.com/dhorions/boxable) I was able to create the table I wanted to. Just an FYI I wanted to create the table with variable number of cells. For example row 1 would have 2 cells, row 2 could have 5 cells and row 3 could have just 3 cells. I was able to do with ease. I followed Example1.java in the link mentioned above.
This example code works for me. I think this would be helpful to you
public static void createTablePdf() throws IOException {
    PDDocument document = new PDDocument();
    PDPage page = new PDPage();
    document.addPage(page);
    int pageHeight = (int) page.getTrimBox().getHeight(); // height of the page
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    contentStream.setStrokingColor(Color.DARK_GRAY);
    contentStream.setLineWidth(1);
    int initX = 50;
    int initY = pageHeight - 50;
    int cellHeight = 20;
    int cellWidth = 100;
    int colCount = 3;
    int rowCount = 3;
    for (int i = 1; i <= rowCount; i++) {
        for (int j = 1; j <= colCount; j++) {
            // the second column is 30 units wider and indents its text further
            int width = (j == 2) ? cellWidth + 30 : cellWidth;
            int textOffset = (j == 2) ? 30 : 10;
            // stroke each cell right away so that no text object is opened
            // while a path is still under construction
            contentStream.addRect(initX, initY, width, -cellHeight);
            contentStream.stroke();
            contentStream.beginText();
            contentStream.newLineAtOffset(initX + textOffset, initY - cellHeight + 10);
            contentStream.setFont(PDType1Font.TIMES_ROMAN, 10);
            contentStream.showText("Dinuka");
            contentStream.endText();
            initX += width;
        }
        // start the next row at the left edge, one cell height lower
        initX = 50;
        initY -= cellHeight;
    }
    contentStream.close();
    document.save("C:\\table.pdf");
    document.close();
    System.out.println("table pdf created");
}

iText - crop out a part of pdf file

I have a small problem and I'm trying for some time to find out a solution.
Long story short, I have to remove the top part of each page of a PDF with iText. I managed to do this with the CropBox, but the problem is that this makes the pages smaller by removing the top part.
Can someone help me implement this so that the page size remains the same? My idea was to cover the top of the page with a white rectangle, but after many tries I didn't manage to do this.
This is the current code I'm using to crop the page.
PdfRectangle rect = new PdfRectangle(55, 0, 1000, 1000);
PdfDictionary pageDict;
for (int curentPage = 2; curentPage <= pdfReader.getNumberOfPages(); curentPage++) {
pageDict = pdfReader.getPageN(curentPage);
pageDict.put(PdfName.CROPBOX, rect);
}
In your code sample, you are cropping the pages. This reduces the visible size of the page.
Based on your description, you don't want cropping. Instead you want clipping.
I've written an example that clips the content of all pages of a PDF by introducing a margin of 200 user units (that's quite a margin). The example is called ClipPdf and you can see a clipped page here: hero_clipped.pdf (the iText superhero has lost arms, feet and part of his head in the clipping process.)
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
int n = reader.getNumberOfPages();
PdfDictionary page;
PdfArray media;
for (int p = 1; p <= n; p++) {
page = reader.getPageN(p);
media = page.getAsArray(PdfName.CROPBOX);
if (media == null) {
media = page.getAsArray(PdfName.MEDIABOX);
}
float llx = media.getAsNumber(0).floatValue() + 200;
float lly = media.getAsNumber(1).floatValue() + 200;
float w = media.getAsNumber(2).floatValue() - media.getAsNumber(0).floatValue() - 400;
float h = media.getAsNumber(3).floatValue() - media.getAsNumber(1).floatValue() - 400;
// use a fixed locale so that the decimal separator is always a dot
String command = String.format(java.util.Locale.ROOT,
"\nq %.2f %.2f %.2f %.2f re W n\nq\n",
llx, lly, w, h);
stamper.getUnderContent(p).setLiteral(command);
stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
}
stamper.close();
reader.close();
}
Obviously, you need to study this code before using it. Once you understand this code, you'll know that this code will only work for pages that aren't rotated. If you understand the code well, you should have no problem adapting the example for rotated pages.
Update
The re operator constructs a rectangle. It takes four parameters (the values preceding the operator) that define a rectangle: the x coordinate of the lower-left corner, the y coordinate of the lower-left corner, the width and the height.
The W operator sets the clipping path. We have just constructed a rectangle; this rectangle will be used to clip the content that follows.
The n operator ends the path without filling or stroking it. In this case, it prevents the rectangle we have constructed (and that we use as the clipping path) from actually being drawn.
The q and Q operators save and restore the graphics state, but that's rather obvious.
All of this is explained in ISO-32000-1 (available online if you Google well) and in the book The ABC of PDF.
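As an illustration of how these operators fit together, the following sketch builds the same kind of clipping command from a page box and a margin. The ClipCommand class is a hypothetical helper and not part of the ClipPdf example; it formats with a fixed locale, on the assumption that the numbers in the content stream must use a dot as the decimal separator whatever the default locale of the JVM is.

```java
import java.util.Locale;

final class ClipCommand {

    /**
     * Builds "q x y w h re W n q" for a box given as lower-left and
     * upper-right corners, shrunk by margin on every side.
     */
    static String build(float llx, float lly, float urx, float ury, float margin) {
        float x = llx + margin;
        float y = lly + margin;
        float w = urx - llx - 2 * margin;
        float h = ury - lly - 2 * margin;
        if (w <= 0 || h <= 0) {
            throw new IllegalArgumentException("margin too large for this box");
        }
        // q: save the graphics state; re: construct the rectangle;
        // W: use it as clipping path; n: end the path without painting it;
        // second q: save the state again before the original content starts
        return String.format(Locale.ROOT,
                "\nq %.2f %.2f %.2f %.2f re W n\nq\n", x, y, w, h);
    }
}
```

The matching "\nQ\nQ\n" still has to be added after the page content, as in the example above.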

unable to calculate itext PdfPTable/PdfPCell height properly

I'm facing a problem while trying to generate a PdfPTable and calculate its height before adding it to a document. The method calculateHeights of PdfPTable returned the height a lot greater than the height of a page (while the table is about 1/4 of page's height), so I wrote a method to calculate the height:
protected Float getVerticalSize() throws DocumentException, ParseException, IOException {
float overallHeight=0.0f;
for(PdfPRow curRow : this.getPdfObject().getRows()) {
float maxHeight = 0.0f;
for(PdfPCell curCell : curRow.getCells()) {
if(curCell.getHeight()>maxHeight) maxHeight=curCell.getHeight();
}
overallHeight+=maxHeight;
}
return overallHeight;
}
where getPdfObject method returns a PdfPTable object.
Using debugger I've discovered that lly and ury coordinate difference (and thus the height) of cell's rectangle is much bigger than it looks after adding a table to a document (for example, one cell is 20 and the other is 38 height while they look like the same on a page). There is nothing in the cell except a paragraph with a chunk in it:
Font f = getFont();
if (f != null) {
int[] color = getTextColor();
if(color != null) f.setColor(color[0],color[1],color[2]);
ch = new Chunk(celltext, f);
par = new Paragraph(ch);
}
cell = new PdfPCell(par);
cell.setHorizontalAlignment(getHorizontalTextAlignment());
cell.setVerticalAlignment(getVerticalTextAlignment());
The cell is then added to a table, and the table's width percentage is set to some float value via setWidthPercentage.
What am I doing wrong? Why are the cell's proportions different from those I see after generating the PDF? Maybe I'm calculating the height wrong? Shouldn't the height of a cell on a PDF page be exactly the difference between its lly and ury coordinates?
Sorry I haven't shown the exact code, because the PDF is being generated of XML using lots of intermediate steps and objects and it is not very useful "as is" I guess...
Thanks in advance!
The height of a table added to a page where the available width is 400 is different from the height of the same table added to a page where the available width is 1000. There is no way you can measure the height correctly until the width is defined.
Defining the width can be done by adding the table to the document. Once the table is rendered, the total height is known.
If you want to know the height in advance, you need to define the width in advance. For instance by using:
table.setTotalWidth(400);
table.setLockedWidth(true);
This is explained in the TableHeight example. In table_height.pdf, you see that iText returns a height of 0 before adding a table and a height of 48 after adding the table. iText initially returns 0 because there is no way to determine the actual height.
We then take the same table and we define a total width of 50 (which is much smaller than the original 80% of the available width on the page). Now when we calculate the height of the table with the same contents, iText returns 192 instead of 48. When you look at the table on the page, the cause of the difference in height is obvious.
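Why the width matters can be seen in a deliberately simplified model. This is not how iText lays out a table: the WrapHeightModel class is hypothetical, and it ignores padding, borders and word boundaries. It only assumes that a run of text with a given natural width is broken into lines that fit the available width, and that every line takes up the same leading.

```java
final class WrapHeightModel {

    /**
     * Toy estimate of the height of one cell: the text is split into as
     * many lines as needed to fit availableWidth, each line being
     * leading units high.
     */
    static float estimatedHeight(float textWidth, float availableWidth, float leading) {
        if (textWidth < 0 || availableWidth <= 0 || leading <= 0) {
            throw new IllegalArgumentException("invalid dimensions");
        }
        int lines = Math.max(1, (int) Math.ceil(textWidth / availableWidth));
        return lines * leading;
    }
}
```

In this model the same text needs more lines, and so more height, the narrower the cell gets, which is why no height can be reported while the width is still unknown.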
In order to get the dynamic table height, we should set and lock the width of the table.
Here, 595 is the width of A4 paper in user units (points).
table.setTotalWidth(595);
table.setLockedWidth(true);
