We are planning to migrate our PDF generation utilities from iText to PDFBox (due to licensing issues with iText). With some effort, I was able to write and position text, draw lines, etc. But creating tables with text embedded in the table cells is a challenge. I went through the documentation, the examples, Google, and Stack Overflow and couldn't find anything. Does PDFBox provide native support for creating tables with embedded text? My last resort would be to use this link: https://github.com/eduardohl/Paginated-PDFBox-Table-Sample
Since I also needed table drawing functionality for a side project, I implemented a small "table drawer" library myself, which I uploaded to github.
In order to produce such a table – for instance – ...
... you would need this code.
In the same file you find the code for that table as well:
The current "feature list" includes:
set font and font size on table level as well as on cell level
define single cells with bottom-, top-, left- and right-border width separately
define the background color on row or cell level
define padding (top, bottom, left, right) on cell level
define border color (on table, row or cell level)
specify text alignment (vertical and horizontal)
cell spanning and row spanning
text wrapping and line spacing
Also it should not be too hard to add missing stuff like having different border colors for borders on top, bottom, left and right-borders, if needed.
Thanks to the links provided by Tilman, and using the boxable API (https://github.com/dhorions/boxable), I was able to create the table I wanted. Just an FYI: I wanted to create a table with a variable number of cells per row. For example, row 1 would have 2 cells, row 2 could have 5 cells, and row 3 could have just 3 cells. I was able to do this with ease by following Example1.java in the link mentioned above.
This example code works for me. I think it will be helpful to you.
public static void createTablePdf() throws IOException {
PDDocument document = new PDDocument();
PDPage page = new PDPage();
document.addPage(page);
int pageWidth = (int)page.getTrimBox().getWidth(); //get width of the page
int pageHeight = (int)page.getTrimBox().getHeight(); //get height of the page
PDPageContentStream contentStream = new PDPageContentStream(document,page);
contentStream.setStrokingColor(Color.DARK_GRAY);
contentStream.setLineWidth(1);
int initX = 50;
int initY = pageHeight-50;
int cellHeight = 20;
int cellWidth = 100;
int colCount = 3;
int rowCount = 3;
for(int i = 1; i<=rowCount;i++){
for(int j = 1; j<=colCount;j++){
if(j == 2){
contentStream.addRect(initX,initY,cellWidth+30,-cellHeight);
contentStream.beginText();
contentStream.newLineAtOffset(initX+30,initY-cellHeight+10);
contentStream.setFont(PDType1Font.TIMES_ROMAN,10);
contentStream.showText("Dinuka");
contentStream.endText();
initX+=cellWidth+30;
}else{
contentStream.addRect(initX,initY,cellWidth,-cellHeight);
contentStream.beginText();
contentStream.newLineAtOffset(initX+10,initY-cellHeight+10);
contentStream.setFont(PDType1Font.TIMES_ROMAN,10);
contentStream.showText("Dinuka");
contentStream.endText();
initX+=cellWidth;
}
}
initX = 50;
initY -=cellHeight;
}
contentStream.stroke();
contentStream.close();
document.save("C:\\table.pdf");
document.close();
System.out.println("table pdf created");
}
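The loop above hard-codes the wider second column inside an if/else. A minimal sketch of how the cell origins could instead be computed from an array of column widths follows. `TableLayout` and `cellOrigins` are hypothetical names used only for illustration and are not part of PDFBox; the returned coordinates would be passed to `addRect` and `newLineAtOffset` in the same way as in the example.

```java
import java.util.ArrayList;
import java.util.List;

final class TableLayout {

    /**
     * Returns the {x, y} top-left corner of every cell, row by row.
     * PDF user space has its y axis pointing upward, so each new row
     * moves down by subtracting the cell height.
     */
    static List<float[]> cellOrigins(float startX, float startY,
                                     float[] colWidths, int rowCount, float cellHeight) {
        List<float[]> origins = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            float x = startX;
            float y = startY - row * cellHeight;
            for (float width : colWidths) {
                origins.add(new float[] {x, y});
                x += width; // the next cell starts where this one ends
            }
        }
        return origins;
    }

    public static void main(String[] args) {
        // Same geometry as the example: columns of 100, 130 and 100 units, 3 rows.
        float[] widths = {100f, 130f, 100f};
        for (float[] origin : cellOrigins(50f, 742f, widths, 3, 20f)) {
            System.out.println(origin[0] + ", " + origin[1]);
        }
    }
}
```

With the widths in one array, adding a column or changing a width no longer requires touching the drawing loop.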
I am using PdfExplicitDestination with a page number for titles, which I find by reading the existing PDF content of each page,
but I need the bookmark to focus on the specific text content when it is clicked.
for (int page = 1; page <= pdf.getNumberOfPages(); page++) {
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
String currentText = PdfTextExtractor.getTextFromPage(pdf.getPage(page), strategy);
if (currentText.contains("title")) {
k.addDestination(PdfExplicitDestination.createXYZ(pdf.getPage(page), pdf.getPage(page).getPageSize().getLeft(), pdf.getPage(page).getPageSize().getTop(), 0));
//System.out.println(currentText);
}
}
I need to find the position of the title in the pdf page to set "float top" value.
PdfExplicitDestination.createXYZ(pageNum, left, top, zoom)
Can anyone please help me get it from the existing content in the PDF?
This task can be approached in a number of ways. One way is to go over the page content in "stripes" (rectangles of small height) and consider only the content of one such rectangle at a time. If you find the text piece in such a rectangle, then you know that the desired text lies somewhere between the upper and lower Y bounds given by the rectangle's coordinates. You can, e.g., create the destination to point to the topmost coordinate in that case. It might be a bit above the desired text, but the difference will be small, depending on the rectangle height you select.
The following code snippet contains an example implementation of this idea. There are two parameters. windowHeight must be tall enough to fit the piece of content you are looking for, but the smaller it is, the more accurate the result. step defines the distance by which the rectangle of height windowHeight is moved down on each try, and thus how many rectangles are tried on each page. The smaller the step, the better the accuracy, but bigger values improve performance. It is up to the specific use case to tweak those trade-offs.
final float windowHeight = 30;
final float step = 10;
for (int page = 1; page <= pdf.getNumberOfPages(); page++) {
Rectangle pageSize = pdf.getPage(page).getPageSize();
for (float upperPoint = pageSize.getHeight(); upperPoint > 0; upperPoint -= step) {
IEventFilter filter = new TextRegionEventFilter(new Rectangle(0, upperPoint - windowHeight, pageSize.getWidth(), windowHeight));
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
FilteredTextEventListener listener = new FilteredTextEventListener(strategy, filter);
new PdfCanvasProcessor(listener).processPageContent(pdf.getPage(page));
if (strategy.getResultantText().contains("title")) {
float top = upperPoint; // This is the topmost point of the rectangle
break; // Break here not to capture same text twice
}
}
}
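The accuracy trade-off of the sliding window can be explored without iText. The following is only a sketch under simplifying assumptions: the searched text is reduced to a single y coordinate (`targetY`), and `StripeScan.findTop` is a hypothetical helper that stands in for the filtered text extraction.

```java
final class StripeScan {

    /**
     * Slides a window of height windowHeight down from the top of the page
     * in increments of step. Returns the upper edge of the first window
     * that contains targetY, or -1 if no window contains it.
     */
    static float findTop(float pageHeight, float windowHeight, float step, float targetY) {
        for (float upper = pageHeight; upper > 0; upper -= step) {
            float lower = upper - windowHeight;
            if (targetY >= lower && targetY <= upper) {
                return upper; // first hit, stop scanning this page
            }
        }
        return -1f;
    }

    public static void main(String[] args) {
        // A4 page height, same window and step values as in the snippet above.
        System.out.println(findTop(842f, 30f, 10f, 700f));
    }
}
```

Because the scan stops at the first window that reaches the text from above, the returned top can lie up to windowHeight above the text itself, which is why a small windowHeight gives a more accurate destination.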
Edit: This is probably a general Excel issue, I'm tracking that here: https://superuser.com/questions/1457518/adding-images-to-excel-that-obey-both-filtering-and-sorting-rules
I am generating worksheets where some rows will have an image embedded with them. Depending on how I embed the image, the images do not hide when the rest of their row's data is hidden OR the images do not sort when the worksheet is sorted.
Example application demonstrating this issue: https://github.com/dan-kirberger/poi-excel-image-issue - it generates two worksheets. Each demonstrating one of my issues. There is also an examples folder with pre-generated worksheets if you would prefer to just look at the resulting workbooks.
The worksheet looks like this before any sorting/filtering is applied:
Sorting/filtering are enabled on the worksheet via:
sheet.setAutoFilter(new CellRangeAddress(sheet.getFirstRowNum(), sheet.getLastRowNum(), 0, 2));
The code (also in the above github link) that adds the image:
Drawing drawing = cell.getSheet().createDrawingPatriarch();
XSSFClientAnchor anchor = new XSSFClientAnchor();
anchor.setAnchorType(imageAnchorType);
anchor.setCol1(cell.getColumnIndex());
anchor.setRow1(cell.getRowIndex());
Picture picture = drawing.createPicture(anchor, pictureId);
picture.resize(1, 1);
In that snippet, imageAnchorType is the deciding factor, if set to MOVE_AND_RESIZE, the images do not get sorted when using the sort functionality in the filters:
Notice that the images no longer match the "Text" column. (The image with a picture of "1" is now next to the text of "Two")
If imageAnchorType is set to MOVE_DONT_RESIZE the images sort appropriately, but when applying filters that remove image rows, the images remain:
We applied a filter to show "Text only" columns, so the "One" and "Three" row data is gone, but their images remain.
Are there any other properties I should be setting to get this to work the way I want?
The problem is not only the anchor type. To provide both, sorting as well as filtering, ClientAnchor.AnchorType.MOVE_AND_RESIZE is correct. For sorting moving must be possible and for filtering resizing must be possible (row height of not visible rows is 0).
But to support sorting, the pictures must also fit into the cells that are sorted. They must not jut out beyond the cell's size, because otherwise they will not be sorted together with the cells. So picture.resize cannot be used, because it resizes the picture to its native size, which will probably be bigger than the size of the cell the picture is anchored to.
ClientAnchor provides following settings:
setCol1 which is the first column the anchor is anchored on. The picture's top left edge starts on left edge of that column.
setDx1 which is the value added to left edge of first column the anchor is anchored on. It shifts the picture horizontally away from left edge of first column.
setRow1 which is the first row the anchor is anchored on. The picture's top left edge starts on top edge of that row.
setDy1 which is the value added to top edge of first row the anchor is anchored on. It shifts the picture vertically away from top edge of first row.
setCol2 which is the second column the anchor is anchored on. The picture's bottom right edge ends on left edge of that column.
setDx2 which is the value added to left edge of second column the anchor is anchored on. It shifts bottom right edge of the picture horizontally away from left edge of second column. This will widen the picture horizontally.
setRow2 which is the second row the anchor is anchored on. The picture's bottom right edge ends on top edge of that row.
setDy2 which is the value added to top edge of second row the anchor is anchored on. It shifts bottom right edge of the picture vertically away from top edge of second row. This will stretch the picture vertically.
To support sorting, Row1 and Row2 must be the same row, so that while sorting that row, the picture belongs to that row. This means the picture's height can only be determined by Dy2, and the picture's height must fit into the row height.
Following code shows an example. The pictures I have downloaded from your github.
Code:
import java.io.InputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.util.IOUtils;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.ClientAnchor;
import org.apache.poi.ss.util.CellRangeAddress;
import org.apache.poi.util.Units;
import org.apache.poi.xssf.usermodel.*;
class CreateExcelPictures {
static String excelPath = "ExcelWithPictures.xlsx";
static String[][] data = new String[][]{
new String[]{"Image", "Text", "Type"},
new String[]{"", "One", "One and Three"},
new String[]{"", "Two", "Two only"},
new String[]{"", "Three", "One and Three"}
};
static String[] pictureFileNames = new String[]{"one.png", "two.png", "three.png"};
static int pictureWidthPx = 30;
static int pictureHeightPx = 25;
static XSSFWorkbook workbook;
static XSSFSheet sheet;
static void addImage(int col1, int row1, int col2, int row2,
int dx1, int dy1, int dx2, int dy2,
String imageFileName, ClientAnchor.AnchorType anchorType) throws Exception {
InputStream imageInputStream = new FileInputStream(imageFileName);
byte[] bytes = IOUtils.toByteArray(imageInputStream);
int pictureId = workbook.addPicture(bytes, Workbook.PICTURE_TYPE_PNG);
imageInputStream.close();
XSSFClientAnchor anchor = workbook.getCreationHelper().createClientAnchor();
anchor.setAnchorType(anchorType);
// set Col1, Dx1, Row1, Dy1, Col2, Dx2, Row2, Dy2
// only this determines the picture's size then
anchor.setCol1(col1);
anchor.setDx1(dx1);
anchor.setRow1(row1);
anchor.setDy1(dy1);
anchor.setCol2(col2);
anchor.setDx2(dx2);
anchor.setRow2(row2);
anchor.setDy2(dy2);
XSSFDrawing drawing = sheet.createDrawingPatriarch();
XSSFPicture picture = drawing.createPicture(anchor, pictureId);
}
public static void main(String args[]) throws Exception {
workbook = new XSSFWorkbook();
sheet = workbook.createSheet();
int r = 0;
for (String[] rowData : data) {
XSSFRow row = sheet.createRow(r);
int c = 0;
for (String cellData : rowData) {
XSSFCell cell = row.createCell(c++);
cell.setCellValue(cellData);
}
if (r > 0) {
float rowHeight = (float)Units.pixelToPoints(pictureHeightPx); // picture's height must fit into row height
row.setHeightInPoints(rowHeight);
addImage(0, r, 0, r, /*all fits in one cell*/
/*Dx1 = 0 and Dy1 = 0, picture's top left edge starts on top left of the cell*/
Units.pixelToEMU(0), Units.pixelToEMU(0),
/*Dx2 is picture's width and Dy2 is picture's height, picture's bottom right edge ends on that point into the cell*/
Units.pixelToEMU(pictureWidthPx), Units.pixelToEMU(pictureHeightPx),
pictureFileNames[r-1], ClientAnchor.AnchorType.MOVE_AND_RESIZE);
}
r++;
}
sheet.setColumnWidth(2, 15*256);
sheet.setAutoFilter(new CellRangeAddress(0, 3, 0, 2));
FileOutputStream fos = new FileOutputStream(excelPath);
workbook.write(fos);
fos.close();
workbook.close();
}
}
Result:
Sorting as well as filtering are possible.
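The unit conversions behind the anchor offsets and the row height can be checked with plain arithmetic. The sketch below assumes the usual 96 pixels per inch, 72 points per inch and 914400 EMU per inch; `AnchorUnits` and its methods are hypothetical helpers written for illustration and are not the POI `Units` class itself.

```java
final class AnchorUnits {

    // 914400 EMU per inch divided by 96 pixels per inch
    static final int EMU_PER_PIXEL = 9525;

    /** Converts a pixel length to EMU, the unit used by Dx and Dy in XSSF anchors. */
    static int pixelToEmu(int pixels) {
        return pixels * EMU_PER_PIXEL;
    }

    /** Converts a pixel length to points, the unit used for row heights. */
    static double pixelToPoints(int pixels) {
        return pixels * 72.0 / 96.0;
    }

    /** True if a picture of the given pixel height does not jut out of the row. */
    static boolean fitsInRow(int pictureHeightPx, double rowHeightPoints) {
        return pixelToPoints(pictureHeightPx) <= rowHeightPoints;
    }

    public static void main(String[] args) {
        int widthPx = 30;
        int heightPx = 25;
        System.out.println("Dx2 = " + pixelToEmu(widthPx) + " EMU");
        System.out.println("Dy2 = " + pixelToEmu(heightPx) + " EMU");
        System.out.println("row height = " + pixelToPoints(heightPx) + " pt");
        System.out.println("fits in default 15 pt row: " + fitsInRow(heightPx, 15.0));
    }
}
```

A check like `fitsInRow` makes it visible why the example sets the row height from the picture height: a 25 pixel picture does not fit into a 15 point row.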
I am working on shrinking a PDF and then watermarking it, using itextpdf-5.5.1.jar. Here is the code I use to shrink the PDF. In the code, the value of both xPercentage and yPercentage is 0.9f. When I shrink a PDF that has a table of contents, the content on the page shrinks properly. But when I go to the table of contents, the bounding boxes of the hyperlinks are misplaced. I noticed that the size of a bounding box is the same in the original and the shrunk output document. How do I shrink the bounding boxes of the hyperlinks along with the content?
public void shrinkPDF(String strFilePath , String strFileName) throws Exception{
PdfReader reader = new PdfReader(strFilePath+"//"+strFileName);
PdfStamper stamper = new PdfStamper(reader, new
FileOutputStream(strFilePath+"//Shrink_"+strFileName));
int n = reader.getNumberOfPages();
Map mpPDFLayer = stamper.getPdfLayers();
for (int p = 1; p <= n; p++) {
float offsetX = (reader.getPageSize(p).getWidth() * (1 - xPercentage)) / 2;
float offsetY = (reader.getPageSize(p).getHeight() * (1 - yPercentage)) / 2;
stamper.getUnderContent(p).setLiteral(
String.format("\nq %s 0 0 %s %s %s cm\nq\n",
xPercentage, yPercentage, offsetX, offsetY));
stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
}
stamper.close();
reader.close();
}
Your code shrinks only the content but it does not accordingly move and shrink annotations. So what you have to do additionally is to iterate over the annotations of each page and shrink them.
This in particular means that you have to shrink and move the Rect annotation rectangle. Depending on the nature of the respective annotation, though, there also are other coordinate values in them, e.g. the QuadPoints in case of a link or the L endpoint coordinates of a line.
BTW, your content shrinking code makes assumptions on the origin of the user space coordinate system; it appears to assume that the origin is in the lower left of the crop box and that the crop box and the media box coincide.
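The arithmetic for shrinking a Rect can mirror the matrix used for the content. The following is a sketch under the same assumption as the content shrinking code, namely that the user space origin is the lower left corner of the page. `AnnotationShrink.shrinkRect` is a hypothetical helper; reading the Rect array from each annotation dictionary and writing it back is left out.

```java
final class AnnotationShrink {

    /**
     * Applies the same scale and offset as the content transformation
     * to a rectangle given as {llx, lly, urx, ury}.
     */
    static float[] shrinkRect(float[] rect, float xPercentage, float yPercentage,
                              float pageWidth, float pageHeight) {
        // Same offsets as in the content shrinking code: center the shrunk page.
        float offsetX = pageWidth * (1 - xPercentage) / 2;
        float offsetY = pageHeight * (1 - yPercentage) / 2;
        return new float[] {
            rect[0] * xPercentage + offsetX,
            rect[1] * yPercentage + offsetY,
            rect[2] * xPercentage + offsetX,
            rect[3] * yPercentage + offsetY
        };
    }

    public static void main(String[] args) {
        float[] link = {100f, 200f, 300f, 400f};
        float[] shrunk = shrinkRect(link, 0.9f, 0.9f, 595f, 842f);
        System.out.println(java.util.Arrays.toString(shrunk));
    }
}
```

QuadPoints and similar coordinate arrays would need the same transformation applied to each x and y pair.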
I am working on a project where as part of statements I need to attach arbitrary PDF files. These PDF files need to be marked by a title and page numbering, in the top-right corner of the PDF file. This is a legal requirement as these attachments are referred to by their title and total number of pages from the statements.
I (naively) hacked together some code that appears to be working on PDF files with pages in the Portrait orientation (at least the PDF files I tested with). However, when I use this code on pages in a Landscape orientation, the title and numbering aren't visible.
The code:
PdfContentByte canvas = pdfStamper.getOverContent( pageNr );
Phrase phrase = new Phrase( sb.toString( ), new Font( FontFamily.HELVETICA, 9f ) ); // sb holds title + page numbering
float width = ColumnText.getWidth( phrase );
ColumnText.showTextAligned ( // draw text top-right
canvas,
Element.ALIGN_LEFT,
phrase,
canvas.getPdfDocument( ).right( ) - width, //x
canvas.getPdfDocument( ).top( ) + 9, //y
0 //rotation
);
Examples:
Portrait where it appears to work:
Landscape where it doesn't work:
Questions:
Where did I go wrong?
Is it possible to write such a piece of code that does it right for all possible page orientations?
If so, how?
You are adding the content, but you are adding it at the wrong place. See PageSize of PDF always the same between landscape and portrait with itextpdf
Let's assume that you are working with an A4 page using portrait orientation. That page measures 595 by 842 user units. 595 is the width; 842 is the height.
Now let's switch to landscape. This can be done in two different ways:
define a width of 595 and a height of 842, and a rotation of 90 degrees.
define a width of 842 and a height of 595.
Which way is used to define the landscape orientation will have an impact on the value of the right() and top() method. I am pretty sure that you are adding the header to the landscape pages, but you are adding them outside the visible area of the page.
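The difference between the two definitions can be made explicit with a small calculation. This is a minimal sketch: `PageOrientation.visibleSize` is a hypothetical helper that takes the stored width and height plus the page rotation and returns the size as the viewer displays it.

```java
final class PageOrientation {

    /**
     * Returns {width, height} as displayed, swapping the stored dimensions
     * when the page is rotated by 90 or 270 degrees.
     */
    static float[] visibleSize(float width, float height, int rotation) {
        int normalized = ((rotation % 360) + 360) % 360; // also handles negative values
        if (normalized == 90 || normalized == 270) {
            return new float[] {height, width};
        }
        return new float[] {width, height};
    }

    public static void main(String[] args) {
        float[] rotated = visibleSize(595f, 842f, 90);   // landscape via rotation
        float[] swapped = visibleSize(842f, 595f, 0);    // landscape via dimensions
        System.out.println(rotated[0] + " x " + rotated[1]);
        System.out.println(swapped[0] + " x " + swapped[1]);
    }
}
```

Both landscape definitions display the same way, but only the second one stores a width of 842, so coordinates taken from the unrotated size can fall outside the visible area in the first case.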
For those interested, I ended up doing it as follows. This works for both Portrait and Landscape orientations. This uses the PdfReader.getPageSizeWithRotation method to get the proper page size.
private String pageText(int pageNr, int pageTotal) {
return ""; // generate string to display top-right of PDF here
}
private void addDocumentObjects(int pageNr, PdfReader pdfReader, PdfStamper pdfStamper) {
final float pageMargin = 25f;
final float textSize = 9f;
final float lineMargin = 5f;
Phrase phrase = new Phrase (
pageText(pageNr, pdfReader.getNumberOfPages()),
new Font(FontFamily.HELVETICA, textSize)
);
final float phraseWidth = ColumnText.getWidth(phrase);
PdfContentByte canvas = pdfStamper.getOverContent(pageNr);
com.itextpdf.text.Rectangle pageRectangle = pdfReader.getPageSizeWithRotation(pageNr);
// draw white background rectangle before adding text + line
canvas.setColorFill(BaseColor.WHITE);
canvas.rectangle (
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin), //y
phraseWidth, // width
textSize + lineMargin //height
);
canvas.fill();
// draw text top right
canvas.setColorFill(BaseColor.BLACK);
ColumnText.showTextAligned (
canvas, //canvas
Element.ALIGN_LEFT, //alignment
phrase, //phrase
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin), //y
0 //rotation
);
// draw line under text
canvas.setColorStroke(BaseColor.BLACK);
canvas.setLineWidth(1);
canvas.moveTo (
pageRectangle.getRight(pageMargin) - phraseWidth, //x
pageRectangle.getTop(pageMargin) - lineMargin //y
);
canvas.lineTo (
pageRectangle.getRight(pageMargin), //x
pageRectangle.getTop(pageMargin) - lineMargin //y
);
canvas.stroke();
}
I'm facing a problem while trying to generate a PdfPTable and calculate its height before adding it to a document. The method calculateHeights of PdfPTable returned the height a lot greater than the height of a page (while the table is about 1/4 of page's height), so I wrote a method to calculate the height:
protected Float getVerticalSize() throws DocumentException, ParseException, IOException {
float overallHeight=0.0f;
for(PdfPRow curRow : this.getPdfObject().getRows()) {
float maxHeight = 0.0f;
for(PdfPCell curCell : curRow.getCells()) {
if(curCell.getHeight()>maxHeight) maxHeight=curCell.getHeight();
}
overallHeight+=maxHeight;
}
return overallHeight;
}
where getPdfObject method returns a PdfPTable object.
Using the debugger, I've discovered that the difference between the lly and ury coordinates (and thus the height) of a cell's rectangle is much bigger than it looks after adding the table to a document (for example, one cell has a height of 20 and another 38, while they look the same on the page). There is nothing in the cell except a paragraph with a chunk in it:
Font f = getFont();
if (f != null) {
int[] color = getTextColor();
if(color != null) f.setColor(color[0],color[1],color[2]);
ch = new Chunk(celltext, f);
par = new Paragraph(ch);
}
cell = new PdfPCell(par);
cell.setHorizontalAlignment(getHorizontalTextAlignment());
cell.setVerticalAlignment(getVerticalTextAlignment());
A table then has the cell added and its setWidthPercentage attribute set to some float.
What am I doing wrong? Why are the cell's proportions different from those I see after generating the PDF? Maybe I'm calculating the height wrong? Shouldn't the height of a cell on a PDF page be strictly the difference between the lly and ury coordinates?
Sorry I haven't shown the exact code, because the PDF is being generated of XML using lots of intermediate steps and objects and it is not very useful "as is" I guess...
Thanks in advance!
The height of a table added to a page where the available width is 400 is different from the height of a table added to a page where the available width is 1000. There is no way you can measure the height correctly until the width is defined.
Defining the width can be done by adding the table to the document. Once the table is rendered, the total height is known.
If you want to know the height in advance, you need to define the width in advance. For instance by using:
table.setTotalWidth(400);
table.setLockedWidth(true);
This is explained in the TableHeight example. In table_height.pdf, you see that iText returns a height of 0 before adding a table and a height of 48 after adding the table. iText initially returns 0 because there is no way to determine the actual height.
We then take the same table and we define a total width of 50 (which is much smaller than the original 80% of the available width on the page). Now when we calculate the height of the table with the same contents, iText returns 192 instead of 48. When you look at the table on the page, the cause of the difference in height is obvious.
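Why the height depends on the width can be illustrated with a deliberately simplified model. This sketch is not iText's layout algorithm: it assumes a single cell whose text breaks at arbitrary positions, ignores padding and borders, and uses the hypothetical helper `WrapHeight` with made-up numbers.

```java
final class WrapHeight {

    /** Number of lines needed when text of the given width wraps at availableWidth. */
    static int lines(float textWidth, float availableWidth) {
        return (int) Math.ceil(textWidth / availableWidth);
    }

    /** Height of the wrapped text for a fixed line height. */
    static float height(float textWidth, float availableWidth, float lineHeight) {
        return lines(textWidth, availableWidth) * lineHeight;
    }

    public static void main(String[] args) {
        float textWidth = 200f;  // width of the text if set on one line
        float lineHeight = 12f;
        System.out.println(height(textWidth, 400f, lineHeight)); // wide table
        System.out.println(height(textWidth, 50f, lineHeight));  // narrow table
    }
}
```

The same content needs more lines in the narrow table, so no height can be reported before the width is fixed.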
In order to get the dynamic table height, we should set and lock the width of the table.
Here, 595 is the width of A4 paper.
table.setTotalWidth(595);
table.setLockedWidth(true);