Tables in PDF with horizontal page breaks

Does someone know a (preferably open-source) PDF layout engine for Java, capable of rendering tables with horizontal page breaks? "Horizontal page breaking" is at least how the feature is named in BIRT, but to clarify: If a table has too many columns to fit across the available page width, I want the table to be split horizontally across multiple pages, e.g. for a 10-column table, the columns 1-4 to be output on the first page and columns 5-10 on the second page. This should of course also be repeated on the following pages, if the table has too many rows to fit vertically on one page.
So far, it has been quite difficult to search for products. I reckon that such a feature may be named differently in other products, making it difficult to use aunt Google to find a suitable solution.
So far, I've tried:
BIRT claims to support this, but the actual implementation is so buggy that it cannot be used. I thought it was self-evident for such functionality that the row height is kept consistent across all pages, making it possible to align the rows when placing the pages next to each other. BIRT, however, calculates the required row height separately for each page.
Jasper has no support.
I also considered Apache FOP, but I don't find any suitable syntax for this in the XSL-FO specification.
iText is generally a little bit too "low level" for this task anyway (making it difficult to layout other parts of the intended PDF documents), but does not seem to offer support.
Since there seem to be some dozens other reporting or layout engines, which may or may not fit and I find it a little bit difficult to guess exactly what to look for, I was hoping that someone perhaps already had similar requirements and can provide at least a suggestion in the right direction. It is relatively important that the product can be easily integrated in a Java server application, a native Java library would be ideal.
Now, to keep the rows aligned across all pages, the row heights must be calculated as follows:
Row1.height = max(A1.height, B1.height, C1.height, D1.height)
Row2.height = max(A2.height, B2.height, C2.height, D2.height)
While BIRT currently seems to do something like:
Page1.Row1.height = max(A1.height, B1.height)
Page2.Row1.height = max(C1.height, D1.height)
Page1.Row2.height = max(A2.height, B2.height)
Page2.Row2.height = max(C2.height, D2.height)
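To make the difference concrete, here is a minimal sketch of both calculations in plain Java. It does not use BIRT or any layout engine: the class name RowHeights, its methods and the sample cell heights are made up for illustration, and the heights are plain numbers rather than measured text.

```java
import java.util.Arrays;

// Illustrative only: cell heights are hard-coded, not measured by a layout engine.
final class RowHeights {

    // Row height taken over ALL columns, so every horizontal page slice agrees.
    static float[] globalRowHeights(float[][] cellHeights) {
        return sliceRowHeights(cellHeights, 0, Integer.MAX_VALUE);
    }

    // Row height taken only over columns [fromCol, toCol), which is what a
    // per-page calculation produces.
    static float[] sliceRowHeights(float[][] cellHeights, int fromCol, int toCol) {
        float[] heights = new float[cellHeights.length];
        for (int r = 0; r < cellHeights.length; r++) {
            int end = Math.min(toCol, cellHeights[r].length);
            float max = 0f;
            for (int c = fromCol; c < end; c++) {
                max = Math.max(max, cellHeights[r][c]);
            }
            heights[r] = max;
        }
        return heights;
    }

    public static void main(String[] args) {
        // Rows 1 and 2, columns A to D (hypothetical heights).
        float[][] cells = {
            {30f, 20f, 10f, 10f},
            {10f, 40f, 10f, 20f}
        };
        System.out.println("all pages:  " + Arrays.toString(globalRowHeights(cells)));
        System.out.println("page 1 only: " + Arrays.toString(sliceRowHeights(cells, 0, 2)));
        System.out.println("page 2 only: " + Arrays.toString(sliceRowHeights(cells, 2, 4)));
    }
}
```

With these sample numbers the per-page heights for page 2 come out smaller than the global ones, which is exactly the misalignment described above.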

It's possible to display a table the way you want with iText. You need to use custom table positioning and custom row and column writing.
I was able to adapt this iText example to write on multiple pages horizontally and vertically. The idea is to remember the first and last row that fit vertically on a page, and then to write that range of rows once per group of columns. I've included the whole code so you can easily run it.
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// iText 5 package names are assumed here; iText 2.x uses com.lowagie.text instead
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfPRow;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;

public class Main {
    public static final String RESULT = "results/part1/chapter04/zhang.pdf";
    public static final float PAGE_HEIGHT = PageSize.A4.getHeight() - 100f;

    public void createPdf(String filename)
            throws IOException, DocumentException {
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter writer
                = PdfWriter.getInstance(document, new FileOutputStream(filename));
        // step 3
        document.open();
        // setup of the table: first row is a really tall one
        PdfPTable table = new PdfPTable(new float[] {1, 5, 5, 1});
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50; i++) {
            sb.append("tall text").append(i + 1).append("\n");
        }
        for (int i = 0; i < 4; i++) {
            table.addCell(sb.toString());
        }
        for (int i = 0; i < 50; i++) {
            sb = new StringBuilder("some text");
            table.addCell(sb.append(i + 1).append(" col1").toString());
            sb = new StringBuilder("some text");
            table.addCell(sb.append(i + 1).append(" col2").toString());
            sb = new StringBuilder("some text");
            table.addCell(sb.append(i + 1).append(" col3").toString());
            sb = new StringBuilder("some text");
            table.addCell(sb.append(i + 1).append(" col4").toString());
        }
        // set the total width of the table
        table.setTotalWidth(600);
        PdfContentByte canvas = writer.getDirectContent();
        ArrayList<PdfPRow> rows = table.getRows();
        // check every row height and split the row if it is taller than the page height
        // can be enhanced to split if the row is 2, 3, ... n times higher than the page
        for (int i = 0; i < rows.size(); i++) {
            PdfPRow currentRow = rows.get(i);
            float rowHeight = currentRow.getMaxHeights();
            if (rowHeight > PAGE_HEIGHT) {
                PdfPRow newRow = currentRow.splitRow(table, i, PAGE_HEIGHT);
                if (newRow != null) {
                    rows.add(++i, newRow);
                }
            }
        }
        List<Integer[]> chunks = new ArrayList<Integer[]>();
        int startRow = 0;
        int endRow = 0;
        float chunkHeight = 0;
        // determine how many rows fit on one page vertically
        // and remember the first and last row that fit on one page
        for (int i = 0; i < rows.size(); i++) {
            PdfPRow currentRow = rows.get(i);
            chunkHeight += currentRow.getMaxHeights();
            endRow = i;
            // verify against some desired height
            if (chunkHeight > PAGE_HEIGHT) {
                // remember start and end row (the end row is exclusive)
                chunks.add(new Integer[]{startRow, endRow});
                startRow = endRow;
                chunkHeight = 0;
                // revisit the row that did not fit; it starts the next chunk
                i--;
            }
        }
        // last pair
        chunks.add(new Integer[]{startRow, endRow + 1});
        // render each startRow - endRow pair on 2 pages horizontally
        // (columns 0-1, then columns 2-3), then go to the next page for the next pair
        for (Integer[] chunk : chunks) {
            table.writeSelectedRows(0, 2, chunk[0], chunk[1], 236, 806, canvas);
            document.newPage();
            table.writeSelectedRows(2, -1, chunk[0], chunk[1], 36, 806, canvas);
            document.newPage();
        }
        document.close();
    }

    public static void main(String[] args) throws IOException, DocumentException {
        new Main().createPdf(RESULT);
    }
}
I understand that iText may be too low level for reports in general, but it can be employed alongside standard reporting tools for special needs like this.
Update: Rows taller than the page height are now split first. The code doesn't handle rows that are 2, 3, ..., n times taller than the page, but it can be adapted for this too.
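The vertical chunking step can also be looked at in isolation, without iText. The following is a rough sketch under the assumption that the row heights are already known and that no single row is taller than a page (rows that are taller have to be split first, as in the code above). The class RowChunker and its method are hypothetical names, not part of iText.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the chunking idea; no iText types are involved.
final class RowChunker {

    // Returns {startRow, endRow} pairs with an exclusive endRow, which is the
    // convention writeSelectedRows uses for its row range.
    static List<int[]> chunkRows(float[] rowHeights, float pageHeight) {
        List<int[]> chunks = new ArrayList<>();
        int start = 0;
        float used = 0f;
        for (int i = 0; i < rowHeights.length; i++) {
            if (rowHeights[i] > pageHeight) {
                throw new IllegalArgumentException(
                        "row " + i + " is taller than a page; split it first");
            }
            if (used + rowHeights[i] > pageHeight) {
                // Row i does not fit any more: close the chunk before it.
                chunks.add(new int[] {start, i});
                start = i;
                used = 0f;
            }
            used += rowHeights[i];
        }
        if (start < rowHeights.length) {
            chunks.add(new int[] {start, rowHeights.length});
        }
        return chunks;
    }

    public static void main(String[] args) {
        float[] heights = {40f, 40f, 40f, 40f, 40f};
        for (int[] chunk : chunkRows(heights, 100f)) {
            System.out.println("rows " + chunk[0] + " to " + (chunk[1] - 1));
        }
    }
}
```

Each resulting pair would then be written once per column group, as in the loop at the end of createPdf.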

This is the same idea as Dev Blanked's answer, but with wkhtmltopdf (https://code.google.com/p/wkhtmltopdf/) and some JavaScript (jQuery) you can achieve what you need. When running wkhtmltopdf against this fiddle you get the result shown below (screenshot of PDF pages). You can place the "break-after" class anywhere on the header row. The script clones the table, removes the columns after the break from the original and the columns up to the break from the clone, and then gives each pair of corresponding rows the same height. We use wkhtmltopdf server-side in a Java EE web app to produce dynamic reports and the performance is actually very good.
HTML
<body>
<table id="table">
<thead>
<tr><th >Header 1</th><th class="break-after">Header 2</th><th>Header 3</th><th>Header 4</th></tr>
</thead>
<tbody>
<tr valign="top">
<td>A1<br/>text<br/>text</td>
<td>B1<br/>text</td>
<td>C1</td>
<td>D1</td>
</tr>
<tr valign="top">
<td>A2</td>
<td>B2<br/>text<br/>text<br/>text</td>
<td>C2</td>
<td>D2<br/>text</td>
</tr>
</tbody>
</table>
</body>
Script
$(document).ready(function() {
    var thisTable = $('#table'),
        otherTable = thisTable.clone(false, true),
        breakAfterIndex = $('tr th', thisTable).index($('tr th.break-after', thisTable)),
        wrapper = $('<div/>');
    // the cloned table starts on a new page
    wrapper.css({'page-break-before': 'always'});
    wrapper.append(otherTable);
    thisTable.after(wrapper);
    // keep the columns up to the break in the original, the rest in the clone
    $('tr', thisTable).find('th:gt(' + breakAfterIndex + ')').remove();
    $('tr', thisTable).find('td:gt(' + breakAfterIndex + ')').remove();
    $('tr', otherTable).find('th:lt(' + (breakAfterIndex + 1) + ')').remove();
    $('tr', otherTable).find('td:lt(' + (breakAfterIndex + 1) + ')').remove();
    // give corresponding rows of both tables the same height
    $('tr', thisTable).each(function(index) {
        var $this = $(this),
            $otherTr = $($('tr', otherTable).get(index)),
            maxHeight = Math.max($this.height(), $otherTr.height());
        $this.height(maxHeight);
        $otherTr.height(maxHeight);
    });
});

Have you tried Flying Saucer (http://code.google.com/p/flying-saucer/)? It is supposed to convert HTML to PDF.

My advice is to use FOP transformer.
Here you can see some examples and how to use it.
Here you can find some examples with fop and tables.

"Jasper has no support."
According to the Jasper documentation it does have support, via:
column break element (i.e. a break element with a type=column attribute). This can be placed at any location in a report.
isStartNewColumn attribute on groups/headers
See http://books.google.com.au/books?id=LWTbssKt6MUC&pg=PA165&lpg=PA165&dq=jasper+reports+%22column+break%22&source=bl&ots=aSKZfqgHR5&sig=KlH4_OiLP-cNsBPGJ7yzWPYgH_k&hl=en&sa=X&ei=h_1kUb6YO6uhiAeNk4GYCw&redir_esc=y#v=onepage&q=column%20break&f=false
If you're really stuck, as a last resort you could use Excel / OpenOffice Calc: manually copy data into cells, manually format it as you desire, save as xls format. Then use apache POI from java to dynamically populate/replace the desired data & print to file/PDF. At least it gives very fine-grained control of column & row formatting/breaks/margins etc.

Related

Loading Excel files in Java takes too much time

I would like to load an Excel file into a Java program every day, parse it and insert the necessary data into a database, but I don't want to load the whole file each time I run the program. I only need the last 90 rows. Is it possible to load an Excel (XLSM) file partially in Java (preferred, but another programming language would do too) to decrease the loading time?
The whole run takes around 60-70 seconds, of which loading the Excel file takes 35 seconds. The file has 4000 rows and each row has 900 columns.
try{
workbook = WorkbookFactory.create(new FileInputStream(file));
sheet = workbook.getSheetAt(0);
rowSize=sheet.getLastRowNum();
myWriter = new FileWriter("/Users/mykyusuf/Desktop/filename.txt");
Row malzeme=sheet.getRow(1);
Row kaynak=sheet.getRow(2);
Row endeks=sheet.getRow(3);
myWriter.write("insert all\n");
Row row=sheet.getRow(rowSize-1);
for (int i = 4; i < rowSize-1; i++) {
row = sheet.getRow(i);
for (Cell cell : row) {
if (cell.getColumnIndex()>3) {
myWriter.write("into piyasa_takip (tarih,malzeme,kaynak,endeks,deger) values (to_date(\'" + row.getCell(3).getLocalDateTimeCellValue().toLocalDate() + "\','YYYY-MM-DD'),\'" + malzeme.getCell(cell.getColumnIndex()) + "\',\'" + kaynak.getCell(cell.getColumnIndex()) + "\',\'" + endeks.getCell(cell.getColumnIndex()) + "\',\'" + cell + "\')\n");
}
}
}
row = sheet.getRow(rowSize-1);
for (Cell cell : row) {
if (cell.getColumnIndex()>3 ) {
myWriter.write("into piyasa_takip (tarih,malzeme,kaynak,endeks,deger) values (to_date(\'" + row.getCell(3).getLocalDateTimeCellValue().toLocalDate() + "\','YYYY-MM-DD'),\'" + malzeme.getCell(cell.getColumnIndex()) + "\',\'" + kaynak.getCell(cell.getColumnIndex()) + "\',\'" + endeks.getCell(cell.getColumnIndex()) + "\',\'" + cell + "\')\n");
}
}
myWriter.write(" Select * from DUAL\n");
myWriter.close();
}

I do not know a simple answer to your question, but I want to help you figure it out.
There are two substantially different formats: *.XLS (old) and *.XLSX (new). In general, the new format is more compact, because the container is zipped.
I don't know a simple way to cut just the last 90 rows out of an Excel file, especially since Excel has a complicated format with tabs, formulas and hyperlinks (and scripts :-) ) in the document.
But we can use the "divide and conquer" principle. If you have a big Excel file locally and it loads very slowly on the remote host, you can process the file locally (extracting only the new records into another file) and upload only these modifications to the remote host.
Thus, you divide the task into two parts: very simple processing of the large file locally (to isolate the changed part), and the normal, smart processing on the remote host.
Maybe this will help you?
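Since the zipped formats store each worksheet as an XML part, one possible way to do the "simple local processing" is to stream that XML and keep only the last N rows, instead of building the whole workbook in memory. The sketch below uses only the JDK (StAX). It is an untested idea for this particular file, not a drop-in replacement for a spreadsheet library: the class LastRowsReader is a made-up name, it returns the raw text of each cell's v element, and it does not resolve shared strings, dates or formulas.

```java
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch: stream a worksheet XML part and keep a sliding window of rows.
final class LastRowsReader {

    // Returns the raw <v> texts of the last maxRows <row> elements.
    // Cells without a <v> element are skipped, so column positions are not preserved.
    // Cells marked t="s" contain indexes into the shared-strings part, not the text itself.
    static List<List<String>> lastRows(InputStream sheetXml, int maxRows) {
        Deque<List<String>> window = new ArrayDeque<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(sheetXml);
            List<String> current = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("row")) {
                        current = new ArrayList<>();
                    } else if (name.equals("v") && current != null) {
                        current.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("row")) {
                    window.addLast(current);
                    if (window.size() > maxRows) {
                        window.removeFirst(); // forget rows that fell out of the window
                    }
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed worksheet XML", e);
        }
        return new ArrayList<>(window);
    }
}
```

To use it on a real file you would open the workbook with java.util.zip.ZipFile and pass the input stream of the worksheet entry. That entry is commonly named xl/worksheets/sheet1.xml, but the actual name depends on the workbook, so treat that path as an assumption. The file still has to be read from start to end, so this saves memory and object creation rather than I/O.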

Maybe you can try Free Spire.XLS to solve this.
I picked some data (70 rows and 8 columns). It took 1-2 seconds to read it.
I hope it helps you save some time.
The code is below:
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class GetCellRange {
    public static void main(String[] args) {
        // Load the sample document
        Workbook workbook = new Workbook();
        workbook.loadFromFile("sample.xlsx");
        // Get the first worksheet
        Worksheet worksheet = workbook.getWorksheets().get(0);
        // Print the chosen range: one line per row, cells separated by tabs
        for (int row = 1; row <= 70; row++) {
            for (int col = 1; col <= 8; col++) {
                System.out.print(worksheet.getCellRange(row, col).getValue() + "\t");
            }
            System.out.println();
        }
    }
}

values are getting overwritten in the table -PDFbox

I want to display a set of records in rows and columns. I am getting output, but the rows overlap. Should I modify the loop? Can someone please suggest a fix?
ArrayList<ResultRecord> Records = new ArrayList<ResultRecord>(MainRestClient.fetchResultRecords(this.savedMainLog));
for(j=0; j<Records.size(); j++)
{
Row<PDPage> row4 = table.createRow(100.0f);
Cell<PDPage> cell10 = row4.createCell(15.0f, temp.getNumber());
Cell<PDPage> cell11 = row4.createCell(45.0f, temp.getDescription());
Cell<PDPage> cell12 = row4.createCell(15.0f, temp.getStatus());
Cell<PDPage> cell13 = row4.createCell(25.0f, temp.getRemarks());
}
Below is the code for opening the PDF file. I want to put the set of records into row4, each value in its corresponding cell, but the rows are written one on top of another.
Expected output:
The rows should be displayed one below another.
Is the overlap caused by defining every row as row4?
try {
//table.draw();
cell.setFontSize(12);
} catch (Exception e) {
System.out.println(e.getMessage());
}
}

First of all, you should clarify the table drawing library you use. PDFBox only is the underlying PDF library. Considering the classes used I would assume you are using Boxable on top of it.
Furthermore, the reason why all the tables are printed over each other is that you start each table at the same position on the same page, you use
BaseTable table = new BaseTable(yPosition, yStartNewPage,
bottomMargin, tableWidth, margin, document, page, true, drawContent);
without ever changing yPosition or page.
To get one table after the other, you have to update yPosition and page accordingly, e.g. by using the return value of table.draw() and the state of table then, i.e. by replacing
table.draw();
by
yPosition = table.draw();
page = table.getCurrentPage();

PDFBOX - header in all pages using easytable

I am using pdfbox and easytable https://github.com/vandeseer/easytable for creating dynamic pages, which works great. But I want a header to be added on all pages. I tried the following:
1) The TableBuilder is created before the rows are written, so we can't create a perfect TableBuilder up front since the rows are dynamic.
2) I tried to insert the header in the middle while creating the TableBuilder, which again is not perfect since TableDrawer fits the rows to the page according to row height.
Any idea/help would be appreciated.
Need output similar to this project - https://github.com/eduardohl/Paginated-PDFBox-Table-Sample . only problem here being the content is not dynamic like easytable.

As an addition to @mkl's answer and its comments: In current versions of the library there is a class of its own for this very requirement.
So your code basically boils down to something like:
try (final PDDocument document = new PDDocument()) {
    RepeatedHeaderTableDrawer.builder()
        .table(createTable())
        .startX(50)
        .startY(100F)
        .endY(50F) // note: if not set, table is drawn over the end of the page
        .build()
        .draw(() -> document, () -> new PDPage(PDRectangle.A4), 50f);
    document.save("your-awesome-document.pdf");
}

This answer was written when easytable did not yet support repeating table headers. It now does; see the answer by philonous, the easytable author.
easytable does not support repeating table headers or footers. Not yet I should say because this feature actually is easy to implement.
It is difficult, though, to implement on top of easytable because that library (like many others) suffers from excessive data hiding: many interesting member variables and methods are private, so extending the classes is not a viable option.
But what you can do is handle the header rows as a separate table which you draw again and again! The downside is a bit of duplication of settings.
In case of the test code TwoPagesTableTest you referred to, it can be changed like this:
final Table.TableBuilder tableHeaderBuilder = Table.builder()
    .addColumnOfWidth(200)
    .addColumnOfWidth(200);
CellText dummyHeaderCell = CellText.builder()
    .text("Header dummy")
    .backgroundColor(Color.BLUE)
    .textColor(Color.WHITE)
    .borderWidth(1F)
    .build();
tableHeaderBuilder.addRow(
    Row.builder()
        .add(dummyHeaderCell)
        .add(dummyHeaderCell)
        .build());
Table tableHeader = tableHeaderBuilder.build();

final Table.TableBuilder tableBuilder = Table.builder()
    .addColumnOfWidth(200)
    .addColumnOfWidth(200);
CellText dummyCell = CellText.builder()
    .text("dummy")
    .borderWidth(1F)
    .build();
for (int i = 0; i < 50; i++) {
    tableBuilder.addRow(
        Row.builder()
            .add(dummyCell)
            .add(dummyCell)
            .build());
}
TableDrawer drawer = TableDrawer.builder()
    .table(tableBuilder.build())
    .startX(50)
    .endY(50F) // note: if not set, table is drawn over the end of the page
    .build();

final PDDocument document = new PDDocument();
float startY = 100F;
do {
    TableDrawer headerDrawer = TableDrawer.builder()
        .table(tableHeader)
        .startX(50)
        .build();
    PDPage page = new PDPage(PDRectangle.A4);
    document.addPage(page);
    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        // draw the header table first, then continue the body table below it
        headerDrawer.startY(startY);
        headerDrawer.contentStream(contentStream).draw();
        drawer.startY(startY - tableHeader.getHeight());
        drawer.contentStream(contentStream).draw();
    }
    startY = page.getMediaBox().getHeight() - 50;
} while (!drawer.isFinished());
document.save("twoPageTable-repeatingHeader.pdf");
document.close();
(RepeatingTableHeaders test createTwoPageTableRepeatingHeader)
As you see, the code first creates a separate Table tableHeader containing only the header row. This table then is added first on each page and a part of the body rows table is added thereafter.
The result: table headers on each page...
A word of warning: This is a proof of concept, I have only tested with the table generation code from TwoPagesTableTest. For production code you should apply further tests.

Stop iText table from spliting on new page

I am developing an app for android that generates pdf.
I am using itextpdf to generate the pdf.
I have the following problem:
I have a table that has 3 rows and when this table is near the end of a page sometimes it puts one row on one page and two rows on the next page.
Is there a way to force this table to start on the next page so I can have the full table on the next page?
Thanks

As an alternative to Bruno's approach of nesting the table in a 1-cell table to prevent splitting, you can also use PdfPTable.setKeepTogether(true) to start the table on a new page when it doesn't fit the current page.
Using a similar example:
Paragraph p = new Paragraph("Test");
PdfPTable table = new PdfPTable(2);
for (int i = 1; i < 6; i++) {
    table.addCell("key " + i);
    table.addCell("value " + i);
}
for (int i = 0; i < 40; i++) {
    document.add(p);
}
// Try to keep the table on 1 page
table.setKeepTogether(true);
document.add(table);
Both approaches (nesting in a 1-cell table and using setKeepTogether()) behave exactly the same in my tests. This includes when the table is too large to fit on the new page and still needs to be split, e.g. when adding 50 instead of 5 rows in the example above.
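The behaviour I observed in those tests can be summarised as a small decision rule. This is only a model of what I saw, written in plain Java with made-up names (KeepTogetherModel, Placement); it is not iText code, and it assumes the current page already holds some content.

```java
// Model of the observed keep-together behaviour; heights are in points.
final class KeepTogetherModel {

    enum Placement { CURRENT_PAGE, NEXT_PAGE, NEXT_PAGE_AND_SPLIT }

    static Placement place(float tableHeight, float remainingOnPage, float fullPageHeight) {
        if (tableHeight <= remainingOnPage) {
            return Placement.CURRENT_PAGE;       // fits where it is
        }
        if (tableHeight <= fullPageHeight) {
            return Placement.NEXT_PAGE;          // moved as a whole
        }
        return Placement.NEXT_PAGE_AND_SPLIT;    // too tall for any single page
    }

    public static void main(String[] args) {
        System.out.println(place(100f, 60f, 700f));
        System.out.println(place(900f, 60f, 700f));
    }
}
```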

Please take a look at the Splitting example:
Paragraph p = new Paragraph("Test");
PdfPTable table = new PdfPTable(2);
for (int i = 1; i < 6; i++) {
    table.addCell("key " + i);
    table.addCell("value " + i);
}
for (int i = 0; i < 40; i++) {
    document.add(p);
}
document.add(table);
We have a table with 5 rows, and in this case, we're adding some paragraphs so that the table is added at the end of a page.
By default, iText will try not to split rows, but if the full table doesn't fit, it will forward the rows that don't fit to the next page:
You want to avoid this behavior: you don't want the table to split.
Knowing that iText will try to keep full rows intact, you can work around this problem by nesting the table you don't want to split inside another table:
PdfPTable nesting = new PdfPTable(1);
PdfPCell cell = new PdfPCell(table);
cell.setBorder(PdfPCell.NO_BORDER);
nesting.addCell(cell);
document.add(nesting);
Now you get this result:
There was sufficient space on the previous page to render a couple of rows, but as we've wrapped the full table inside a row with a single column, iText will forward the complete table to the next page.

Large table in table cell invoke page break

I have a single PdfPTable with a single column. One page fits 50 rows. If I add only text data to the table (for example, 300 rows), the report works fine. When I add a PdfPTable into a cell (for example, 20 string cells, a PdfPTable with 20 rows or fewer, and 270 string cells), everything works fine too:
/--------
20 string rows
inner table (20 rows)
10 string rows
/-------
...additional rows
But when my inner table has more rows (mainTable[20 string rows, innerTable[90 string rows], 270 string rows]), the report breaks the first page and starts the innerTable output on the second page.
/---------
20 string rows
whitespace for 30 rows
/---------
inner table (50 rows from 90)
/---------
inner table (40 rows from 90)
..additional data
And I need this:
/---------
20 string rows
inner table (30 rows from 90)
/---------
inner table (50 rows from 90)
/---------
inner table (10 rows from 90)
..additional data
Anybody know solution?
ps itext version - 2.1.7

Please take a look at the NestedTables2 example. In this example, we tell iText that it's OK to split cells as soon as a row doesn't fit on a page:
This is not the default: by default, iText will try to keep a row intact by forwarding it to the next page. Only if the row doesn't fit the page, the row will be split.
Changing the default involves adding a single line to your code. That line is called: table.setSplitLate(false);
This is the full method that created the PDF shown in the screen shot:
public void createPdf(String dest) throws IOException, DocumentException {
    Document document = new Document();
    PdfWriter.getInstance(document, new FileOutputStream(dest));
    document.open();
    PdfPTable table = new PdfPTable(2);
    // split a row as soon as it doesn't fit, instead of forwarding it first
    table.setSplitLate(false);
    table.setWidths(new int[]{1, 15});
    for (int i = 1; i <= 20; i++) {
        table.addCell(String.valueOf(i));
        table.addCell("It is not smart to use iText 2.1.7!");
    }
    PdfPTable innertable = new PdfPTable(2);
    innertable.setWidths(new int[]{1, 15});
    for (int i = 0; i < 90; i++) {
        innertable.addCell(String.valueOf(i + 1));
        innertable.addCell("Upgrade if you're a professional developer!");
    }
    table.addCell("21");
    table.addCell(innertable);
    for (int i = 22; i <= 40; i++) {
        table.addCell(String.valueOf(i));
        table.addCell("It is not smart to use iText 2.1.7!");
    }
    document.add(table);
    document.close();
}
Note that you should not use iText 2.1.7. If you do, you could have a problem with your employer, customer, investor. Read more about this in the legal section of the free ebook The Best iText Questions on StackOverflow
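To illustrate the difference between the default and setSplitLate(false), here is a deliberately simplified model in plain Java. It is not how iText is implemented: heights are counted in "lines", a page holds a fixed number of lines, and the class SplitLateModel and its method are hypothetical names. The numbers mirror the question (a page fits 50 lines, 20 lines of text come first, then a 90-line nested table).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of row placement; each inner list is one page, each number the
// count of lines a row (or a fragment of a split row) occupies on that page.
final class SplitLateModel {

    static List<List<Integer>> layout(int[] rowHeights, int pageCapacity, boolean splitLate) {
        if (pageCapacity <= 0) {
            throw new IllegalArgumentException("pageCapacity must be positive");
        }
        List<List<Integer>> pages = new ArrayList<>();
        List<Integer> page = new ArrayList<>();
        int free = pageCapacity;
        for (int height : rowHeights) {
            int remaining = height;
            while (remaining > 0) {
                if (remaining <= free) {
                    page.add(remaining);          // the rest of the row fits
                    free -= remaining;
                    remaining = 0;
                } else if (free == 0 || (splitLate && free < pageCapacity)) {
                    pages.add(page);              // forward the row to a fresh page
                    page = new ArrayList<>();
                    free = pageCapacity;
                } else {
                    page.add(free);               // split: fill the page, carry the rest over
                    remaining -= free;
                    pages.add(page);
                    page = new ArrayList<>();
                    free = pageCapacity;
                }
            }
        }
        if (!page.isEmpty()) {
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        int[] rows = {20, 90, 10}; // 20 lines of text, the nested table, 10 more lines
        System.out.println("splitLate=true:  " + layout(rows, 50, true));
        System.out.println("splitLate=false: " + layout(rows, 50, false));
    }
}
```

In this model the default leaves 30 lines of whitespace on the first page before the nested table starts, while splitLate=false fills the first page with the first 30 lines of the nested table, which is the layout asked for in the question.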
