Apache POI seeing columns in empty spreadsheet? - java

I have an empty spreadsheet, but when I'm accessing it with Apache POI (version 3.10), it says it has 1024 columns and 20 physical columns.
I really deleted all the cells, only some formatting remains, but no content.
And if I delete some columns with LibreOffice Calc (version 4.1.3.2), the number of columns only increases! What's going on?
Is there a reliable way to get the real number of columns (or cells in a row)?
import java.net.URL;
import org.apache.poi.ss.usermodel.*;

public class Test {
    public static void main(final String... args) throws Exception {
        final URL url = new URL("http://aditsu.net/empty.xlsx");
        final Workbook w = WorkbookFactory.create(url.openStream());
        final Row r = w.getSheetAt(0).getRow(0);
        System.out.println(r.getLastCellNum());
        System.out.println(r.getPhysicalNumberOfCells());
    }
}

After some more investigation, I think I figured out what's happening.
First, some terminology from POI: there are some cells that don't actually exist at all in the spreadsheet - those are called missing, or undefined/not defined. Then there are some cells that are defined, but have no value - those are called blank cells. Both types of cells appear empty in a spreadsheet program and can't be distinguished visually.
My spreadsheet has some blank cells that LibreOffice added at the end of the row (possibly a bug). When I delete columns, LibreOffice seems to shift the subsequent cells (including the blank ones) to the left, and adds more blank cells at the end (up to 1024).
And now the key part: neither getLastCellNum() nor getPhysicalNumberOfCells() ignores blank cells. getLastCellNum() gives the last defined cell, and getPhysicalNumberOfCells() gives the number of defined cells, both including blank cells. There doesn't seem to be any method available that skips blank cells. The javadoc for getPhysicalNumberOfCells() is somewhat misleading - "if only columns 0,4,5 have values then there would be 3", but it's actually counting blank cells too, which don't really have values.
So the only solution I found is to loop through the cells and check if they are blank.
Side note: getLastRowNum() and getFirstCellNum() are 0-based but getLastCellNum() is 1-based, wtf?
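The loop described above could look roughly like the following. This is a minimal sketch, not something provided by POI: the class name RowCells and the method lastNonBlankCellIndex are made up for illustration, and it assumes POI 4.0 or later, where getCellType() returns the CellType enum (on 3.10, as used in the question, the comparison would be against Cell.CELL_TYPE_BLANK instead).

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;

// Hypothetical helper, not part of POI.
final class RowCells {
    private RowCells() {}

    /** Returns the 0-based index of the last cell that is neither missing nor blank, or -1 if there is none. */
    static int lastNonBlankCellIndex(Row row) {
        if (row == null) {
            return -1; // getRow() returns null for rows that are not defined
        }
        // getLastCellNum() is the last defined index PLUS ONE (or -1 for a row without cells)
        for (int i = row.getLastCellNum() - 1; i >= 0; i--) {
            Cell cell = row.getCell(i); // null for a missing cell
            if (cell != null && cell.getCellType() != CellType.BLANK) {
                return i;
            }
        }
        return -1;
    }
}
```

Adding 1 to the result gives a column count that ignores trailing blank cells. Cells that hold an empty string still count as non-blank in this sketch.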

Most likely you have some kind of formatting applied to your row. I have an empty xlsx file created with Excel, and getRow returns null for its empty rows.

@aditsu as per https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/Row.html, getLastCellNum() gets the index of the last cell contained in this row PLUS ONE.
+1 for the LibreOffice struggle! It's a bug, and in my opinion it is very random. I'm getting null randomly, and it often helps if I delete the EMPTY rows (below) and EMPTY columns (on the right side).
...

Related

Apache Poi cell not returning the correct value

I have an Excel file with a cell that generates the number 3.69 (based on calculations from preceding numbers).
However, when pulling that number in Java using
if (brightCell.getNumericCellValue() > 0)
{
    double brightness = brightCell.getNumericCellValue();
    return brightness;
}
I've also tried:
if (Double.parseDouble(brightCell.getStringCellValue()) > 0)
{
    double brightness = Double.parseDouble(brightCell.getStringCellValue());
    return brightness;
}
brightCell is instantiated with :
brightCell = spreadsheet.getRow(new CellReference(brightString).getRow()).getCell(new CellReference(brightString).getCol());
brightString is String brightString = "BV29"
But with both solutions, brightness receives the value, 3.2133....
So thanks to @Igor I managed to figure it out, but it led to more issues.
The solution was to create an evaluator
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
evaluator.setIgnoreMissingWorkbooks(true); //if you need it
and, once you have finished setting the required cells and want to evaluate, to call
evaluator.evaluateAll();
The problem for me is that I'm doing this multiple times: my first result is correct, but on the second iteration it becomes skewed, and then more skewed.
What I'm doing is setting various cells (via Java), then before I retrieve the value of a cell (that contains a formula) I run evaluateAll. Now, I'm not sure if I should be evaluating after EVERY change or after I make all my changes to the Excel sheet (via Java).
I can't evaluate one specific cell at a time because there are over 38 sheets with multitudes of formulas. So evaluateAll is the best option for me.
EDIT 26/10/2018*
So the issue was that I was not clearing the cache after making inputs. The solution is to clear it after each input, as the Javadoc specifies:
Should be called whenever there are changes to input cells in the evaluated workbook.
Failure to call this method after changing cell values will cause incorrect behaviour
of the evaluate~ methods of this class
Therefore, after making an input on a cell, you should call evaluator.clearAllCachedResultValues();
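As a rough illustration of that order of operations (a sketch only: the class and method names are invented here, and it assumes a single FormulaEvaluator is reused across changes):

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;

// Hypothetical helper, not part of POI.
final class Recalc {
    private Recalc() {}

    /** Writes a new input value, drops the evaluator's cached results, then re-evaluates one formula cell. */
    static double setInputAndEvaluate(FormulaEvaluator evaluator, Cell input,
                                      double newValue, Cell formulaCell) {
        input.setCellValue(newValue);
        // Without this call the evaluator may keep using values it cached before the change.
        evaluator.clearAllCachedResultValues();
        return evaluator.evaluate(formulaCell).getNumberValue();
    }
}
```

When many inputs are changed in one batch, a single clearAllCachedResultValues() call after the last change, followed by evaluateAll(), should be enough; that part is an assumption based on the Javadoc quoted above, not something verified in the question.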

How to keep track of inserted page numbers or get a row reference when a page number is changing in .docx file using Apache POI

I am creating a Java application that works with a .docx document file. I am using Apache POI to generate and modify it.
The existing file has tables like this:
Tables that are created initially.
I have to add rows to the first table so that the output looks something like this: The way the 2 tables should be shown
So table 1 is allowed to occupy as much space as fits on the document page, and it is followed by another table. This kind of table is needed because I then remove its inside borders, so that it looks like this: how it should be shown with inside borders removed.
I am adding the blank rows using this piece of code:
for (int i = 0; i < itemCount; i++) {
    oldRow = table.getRow(i);
    newRow = table.insertNewTableRow(i + 1);
    for (int j = 0; j < oldRow.getTableCells().size(); j++) {
        cell = newRow.createCell();
        CTTcPr ctTcPr = cell.getCTTc().addNewTcPr();
        // create the borders element once; a second addNewTcBorders() call would add a second element
        CTTcBorders borders = ctTcPr.addNewTcBorders();
        borders.addNewTop().setVal(STBorder.NIL);
        borders.addNewBottom().setVal(STBorder.NIL);
        CTTblWidth cellWidth = ctTcPr.addNewTcW();
        // sets type of width
        cellWidth.setType(oldRow.getCell(j).getCTTc().getTcPr().getTcW().getType());
        BigInteger width = oldRow.getCell(j).getCTTc().getTcPr().getTcW().getW();
        cellWidth.setW(width); // sets width
        if (oldRow.getCell(j).getCTTc().getTcPr().getGridSpan() != null) {
            // sets grid span if any
            ctTcPr.setGridSpan(oldRow.getCell(j).getCTTc().getTcPr().getGridSpan());
        }
    }
}
In the first few rows, I set a value like this:
paragraph = row.getCell(0).getParagraphArray(0);
if (paragraph == null) {
    paragraph = row.getCell(0).addParagraph();
}
run = setRunAndParagraph(paragraph);
run.setText(itemNames[i]);
All this works nicely; this code was written with the help of a great coder here who helped me solve an earlier issue.
Now, when I am inserting the rows, I can manually count how many rows fit in table 1 on an A4 page and iterate accordingly. But the issue is that some item names are wider than the column, and as word wrap is on, the row for such an item is twice as tall as normal. So I can get the number of rows, but I cannot tell exactly how many lines the items occupy.
The appended rows are there for a better look and are application specific, so I thought of an approach that may work:
I am adding page numbers in the footer. After inserting a blank row, I check whether the total page count of the document has changed; if it has, the code stops there. And as there is then 1 excess row, I remove it.
But the issue that arises here is that I am not able to get the page number. I tried
document.getProperties().getExtendedProperties().getUnderlyingProperties().getPages();
But that always shows 1, even if the total number of pages becomes 2. I found an approach that suggests checking for form feeds, but since I have tables here I don't know whether it would be helpful. So I tried adding the page number to the page as a field, but that also gave 1 every time. Then I tried to do it with a footer, and this is how I am writing the page number:
XWPFHeaderFooterPolicy p = document.createHeaderFooterPolicy();
XWPFFooter f = p.createFooter(XWPFHeaderFooterPolicy.DEFAULT);
paragraph = f.createParagraph();
//paragraph.createRun().setText("Page: ");
paragraph.createRun();
paragraph.getCTP().addNewFldSimple().setInstr("PAGE \\* MERGEFORMAT");
Now I don't know how to get the value of this field. I tried getText() on the footer, but it did not give the number. I read that fields are related to paragraphs, so I tried to get the paragraph from the footer and call getText() on it, but the output is empty. So can someone help me with this scenario? Right now I am getting a result like this: some part of second table going in next page.
I tried to implement a solution I found, but I could not get it resolved yet. So if someone can help me with how to fetch the page number from the field inserted in the document, it would be really great. I tried to provide more details so that if there's a better approach than this, someone can suggest it. Thank you.. :)

POI: setCellType(Cell.CELL_TYPE_FORMULA) fails because of Cell.CELL_TYPE_ERROR

My Java application reads an xls file and presents it on a JTable. So far so good.
When I try to save my worksheet, I iterate over row,col in my JTable and:
String str = (String) Table.getValueAt(row, col);
HSSFRow thisrow = sheet.getRow(row);
HSSFCell thiscell = thisrow.getCell(col);
if (thiscell == null) thiscell = thisrow.createCell(col);
switch (inferType(str)) {
    case "formula":
        thiscell.setCellType(Cell.CELL_TYPE_FORMULA);
        thiscell.setCellFormula(str.substring(1));
        break;
    case "numeric":
        thiscell.setCellType(Cell.CELL_TYPE_NUMERIC);
        thiscell.setCellValue(Double.parseDouble(str));
        break;
    case "text":
        thiscell.setCellType(Cell.CELL_TYPE_STRING);
        thiscell.setCellValue(str);
        break;
}
But when I run over a cell which was originally a formula, say A1/B1, that currently evaluates to #DIV/0!, setCellType fails.
After much investigation I found out that when setCellType is called, it tries to convert the old content to the new type. BUT this didn't seem like a problem to me, since every formula cell in the table was already a formula in the xls. Hence, I am never actually changing types.
Even so, when I call setCellType(Cell.CELL_TYPE_FORMULA) on a cell that is already a formula but evaluates to #DIV/0!, I get a conversion exception.
Exception in thread "AWT-EventQueue-0" java.lang.IllegalStateException: Cannot get a numeric value from a error formula cell
at org.apache.poi.hssf.usermodel.HSSFCell.typeMismatch(HSSFCell.java:648)
at org.apache.poi.hssf.usermodel.HSSFCell.checkFormulaCachedValueType(HSSFCell.java:653)
at org.apache.poi.hssf.usermodel.HSSFCell.getNumericCellValue(HSSFCell.java:678)
at org.apache.poi.hssf.usermodel.HSSFCell.setCellType(HSSFCell.java:317)
at org.apache.poi.hssf.usermodel.HSSFCell.setCellType(HSSFCell.java:283)
Currently my only workaround is, before setCellType:
if (thiscell.getCachedFormulaResultType() == Cell.CELL_TYPE_ERROR)
    thiscell = thisrow.createCell(col);
This IS working, but I lose the original layout of the cell, e.g. its colors.
How can I properly setCellType if the Cell is a formula with evaluation error?
I found this in the mailing list of poi-apache:
There are two possible scenarios when setting value for a formula cell:
(1) Update the pre-calculated value of the formula. If a cell contains formula then cell.setCellValue just updates the pre-calculated (cached) formula value, the formula itself remains and the cell type is not changed
(2) Remove the formula and change the cell type to String or Number:
cell.setCellFormula(null); //Remove the formula
then cell.setCellValue("I changed! My type is CELL_TYPE_STRING now");
or cell.setCellValue(200); //NA() is gone, the real value is 200
I think we can improve cell.setCellValue for the case (1). If the new value conflicts with formula type then IllegalArgumentException should be thrown.
Regards, Yegor
Still, it does feel like a workaround to me. But everything is now working.
Calling cell.setCellFormula(null) before any setCellType should prevent the conversion failure, because the former discards the cached content.
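The inferType helper used in the question's switch is not shown. A minimal sketch of what it might look like, assuming formulas are typed with a leading = (which matches the str.substring(1) call in the question) and that anything Double.parseDouble accepts counts as numeric; the class name is invented for illustration:

```java
// Hypothetical reconstruction of the question's inferType helper.
final class CellTypes {
    private CellTypes() {}

    /** Returns "formula", "numeric" or "text" for a value read from the table. */
    static String inferType(String value) {
        if (value == null || value.isEmpty()) {
            return "text";
        }
        // Assumption: a leading '=' followed by at least one character marks a formula.
        if (value.startsWith("=") && value.length() > 1) {
            return "formula";
        }
        try {
            Double.parseDouble(value);
            return "numeric";
        } catch (NumberFormatException e) {
            return "text";
        }
    }
}
```

Returning "text" for null also means the switch in the question never receives a null selector, which would otherwise throw a NullPointerException.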

apache poi - reading comments from blank and missing (null) cells

I'm trying to read the comments from all cells of an Excel document (using Apache POI).
I have a problem when empty (or missing) cells contain comments.
Currently the only solution I have found is to:
iterate over every row up to the last non-empty column
get all (even empty) cells
check whether the cell's comment is non-null
if so: handle the comment
Some code:
if (row != null) {
    cell = row.getCell(cellNum, Row.CREATE_NULL_AS_BLANK);
    cellComment = cell.getCellComment();
    if (cellComment != null)
        ...
}
The main problem is that I can't read comments from empty rows, or comments that come after the last non-empty cell.
Better performance (compared to reading all the cells in a row) would be nice, but the main point is to read ALL of the document's comments.
You can read the comments of a blank cell or a null cell using a missing cell policy: row.getCell(int cellnum, MissingCellPolicy policy) lets you deal with cells that are blank or null.
For example, say the row at index 7 in your sheet is blank and its cell at column index 5 has a comment (say "Hello"), and you need to read that comment. Just do the following (assuming the row itself exists, since getRow returns null for a row that is not defined):
Comment comment = sheet.getRow(7).getCell(5, Row.CREATE_NULL_AS_BLANK).getCellComment();
System.out.println(comment.getString());
This will print "Hello".
To all martyrs who are using Apache POI and trying to do what the OP wants: use the sheet.getCellComments() method. It returns a TreeMap whose keys are CellAddress instances and whose values are Comment instances, even if the cells that hold them are null or missing.
To make such cells visible to POI iterators, simply call getCell with Row.MissingCellPolicy.CREATE_NULL_AS_BLANK.
e.g. (for Java 1.8+):
sheet.getCellComments().forEach((cellAddress, o) ->
    sheet.getRow(cellAddress.getRow()).getCell(cellAddress.getColumn(),
        Row.MissingCellPolicy.CREATE_NULL_AS_BLANK));

Drawing a horizontal line to the table at the end of page in iText?

I'm creating a table using iText. Each table has 2 columns and has no borders except on the leftmost, rightmost, topmost and bottommost sides of the table. I am able to achieve this, but the problem occurs when a new page begins. I want to draw a horizontal line on the table at the end of the page and another horizontal line where it begins on the next page. I have tried using
@Override
public void onEndPage(PdfWriter arg0, Document arg1) {
    PdfPCell pdfpcells[] = pdfptable.getRow(pdfptable.getRows().size()-1).getCells();
    pdfpcells[0].setBorderWidthBottom(0.5f);
    if (pdfpcells[1] != null) { // There is a possibility that there is an odd number of elements
        pdfpcells[1].setBorderWidthBottom(0.5f);
    }
}
for drawing the horizontal line at the end of the page, assuming this function is called every time a page ends and hence uses the current number of rows. pdfptable is declared as a class variable. This doesn't seem to work. I am using the latest version of iText.
Thanks.
Can you post the code that constructs the table? Do you make one per page or are you relying on the auto-split of the PdfPTable?
The code below should do the trick:
PdfPCell pdfPCells[] = table.getRow(table.getRows().size() - 1).getCells();
for (PdfPCell pdfPCell : pdfPCells) {
    pdfPCell.setBorder(PdfPCell.BOTTOM);
}
As you can see there is no need for you to worry about the number of elements in the array, if you just use a for-each loop.
