We are using Apache POI 3.8 to parse Excel files. We need to be able to detect (and skip) hidden rows, as they tend to contain junk data in our use cases.
It would seem this should work:
row.isFormatted() && row.getRowStyle().getHidden()
But there never appears to be any row-level formatting (getRowStyle() always returns null). As a last resort we thought checking cell styles might work:
for (int i = 0; i < row.getLastCellNum(); i++) {
Cell cell = row.getCell(i);
if (cell != null && cell.getCellStyle() != null && cell.getCellStyle().getHidden())
...
But for every row we get (custom output in the above for loop):
Cell 0 is not hidden org.apache.poi.hssf.usermodel.HSSFCellStyle@1b9142d0 / false
Does the "getHidden()" not work or does it not work as I think it does? Is there another way to detect hidden rows? (hidden columns would also be a nice bonus but slightly less relevant atm)
getRowStyle should normally work as you supposed.
Otherwise, you can check the height of the row, as hidden rows tend to have their height set to 0, using row.getHeight() or row.getZeroHeight().
After trying a few approaches, row.getZeroHeight() worked correctly for identifying hidden rows. For those who are stuck with Apache POI <= 3.7, this may be the only solution.
I have an Excel file with a cell that generates the number 3.69 (based on calculations from preceding numbers).
However, when pulling that number in Java using
if (brightCell.getNumericCellValue() > 0)
{
    double brightness = brightCell.getNumericCellValue();
    return brightness;
}
I've also tried:
if (Double.parseDouble(brightCell.getStringCellValue()) > 0 )
{
double brightness = Double.parseDouble(brightCell.getStringCellValue());
return brightness;
}
brightCell is instantiated with :
brightCell = spreadsheet.getRow(new CellReference(brightString).getRow()).getCell(new CellReference(brightString).getCol());
brightString is String brightString = "BV29"
But with both solutions, brightness receives the value, 3.2133....
So thanks to @Igor I managed to figure it out, but it led to more issues.
So the solution was creating an evaluator
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
evaluator.setIgnoreMissingWorkbooks(true); //if you need it
When you finish setting the required cells and want to evaluate:
evaluator.evaluateAll();
The problem for me is that I'm doing this multiple times: my first result is correct, but on the second iteration it becomes skewed, and then more skewed.
What I'm doing is setting various cells (via Java), then before I retrieve the value of a cell (that contains a formula) I run evaluateAll(). Now, I'm not sure if I should be evaluating after EVERY change or after I make all my changes to the Excel sheet (via Java).
I can't evaluate one specific cell at a time because there are over 38 sheets with multitudes of formulas. So evaluateAll() is the best option for me.
EDIT 26/10/2018
So the issue was that I was not clearing the cache after making inputs. The Javadoc for clearAllCachedResultValues() says:
Should be called whenever there are changes to input cells in the evaluated workbook. Failure to call this method after changing cell values will cause incorrect behaviour of the evaluate~ methods of this class
Therefore, after making an input on a cell you should call evaluator.clearAllCachedResultValues();
I am creating a Java application that works with a .docx file. I am using Apache POI to generate and modify it.
The existing file has tables like:
Tables that are created initially.
I have to add rows to the first table so that the output is something like: The way the 2 tables should be shown
So I am letting table 1 occupy as much space as it can on the document page, and it is followed by another table. This type of table is needed because afterwards I remove its inside borders and it shows something like: how it should be shown with inside borders removed.
I am adding the blank rows using this piece of code:
for (int i = 0; i < itemCount; i++) {
    oldRow = table.getRow(i);
    newRow = table.insertNewTableRow(i + 1);
    for (int j = 0; j < oldRow.getTableCells().size(); j++) {
        cell = newRow.createCell();
        CTTcPr ctTcPr = cell.getCTTc().addNewTcPr();
        // create a single tcBorders element and set both borders on it
        CTTcBorders borders = ctTcPr.addNewTcBorders();
        borders.addNewTop().setVal(STBorder.NIL);
        borders.addNewBottom().setVal(STBorder.NIL);
        CTTcPr oldTcPr = oldRow.getCell(j).getCTTc().getTcPr();
        CTTblWidth cellWidth = ctTcPr.addNewTcW();
        cellWidth.setType(oldTcPr.getTcW().getType()); // sets type of width
        cellWidth.setW(oldTcPr.getTcW().getW()); // sets width
        if (oldTcPr.getGridSpan() != null) {
            ctTcPr.setGridSpan(oldTcPr.getGridSpan()); // sets grid span if any
        }
    }
}
In the first few rows, I add values like this:
paragraph = row.getCell(0).getParagraphArray(0);
if (paragraph == null) {
paragraph = row.getCell(0).addParagraph();
}
run = setRunAndParagraph(paragraph);
run.setText(itemNames[i]);
All this works nicely; a great coder here helped me with this code when solving an earlier issue.
Now, when inserting the rows, I can manually count how many rows fit in table 1 on an A4 page and iterate accordingly. But the issue is that some item names are wider than the column, and as word wrap is on, the row for that item is twice the normal height. So I can get the number of rows, but I cannot tell exactly how many lines the items occupy.
The appended rows are there for appearance and are application specific, so I thought of an approach that may work:
I add page numbers in the footer. After inserting a blank row, I check whether the total number of pages in the document has changed; if it has, the code stops there. And as there is 1 excess row, I remove it.
But the issue that arises here is that I am not able to get the page number. I tried
document.getProperties().getExtendedProperties().getUnderlyingProperties().getPages();
But that always shows 1 even when the total number of pages becomes 2. I found an approach that checks for form feeds, but since I have tables here I don't know whether it would help. So I tried adding the page number to the page as a field, but that also gave 1 every time. Then I tried to do it with the footer, and this is how I am writing the page number:
XWPFHeaderFooterPolicy p = document.createHeaderFooterPolicy();
XWPFFooter f = p.createFooter(XWPFHeaderFooterPolicy.DEFAULT);
paragraph = f.createParagraph();
//paragraph.createRun().setText("Page: ");
paragraph.createRun();
paragraph.getCTP().addNewFldSimple().setInstr("PAGE \\* MERGEFORMAT");
Now I don't know how to get the value of this field. I tried getText() on the footer, but it did not give the number. I read that fields belong to a paragraph, so I got the paragraph from the footer and called getText() on it, but the output is empty. So can someone help me with this scenario? Right now I am getting a result like this: some part of the second table going onto the next page.
I tried to implement a solution I found, but I could not get it resolved yet. If someone can show me how to fetch the page number from the field inserted in the document, it would be really great. I have provided more details so that if there is a better approach than this, someone can suggest it. Thank you.. :)
I have an empty spreadsheet, but when I'm accessing it with Apache POI (version 3.10), it says it has 1024 columns and 20 physical columns.
I really deleted all the cells, only some formatting remains, but no content.
And if I delete some columns with LibreOffice Calc (version 4.1.3.2), the number of columns only increases! What's going on?
Is there a reliable way to get the real number of columns (or cells in a row)?
import java.net.URL;
import org.apache.poi.ss.usermodel.*;
public class Test {
public static void main(final String... args) throws Exception {
final URL url = new URL("http://aditsu.net/empty.xlsx");
final Workbook w = WorkbookFactory.create(url.openStream());
final Row r = w.getSheetAt(0).getRow(0);
System.out.println(r.getLastCellNum());
System.out.println(r.getPhysicalNumberOfCells());
}
}
After some more investigation, I think I figured out what's happening.
First, some terminology from POI: there are some cells that don't actually exist at all in the spreadsheet - those are called missing, or undefined/not defined. Then there are some cells that are defined, but have no value - those are called blank cells. Both types of cells appear empty in a spreadsheet program and can't be distinguished visually.
My spreadsheet has some blank cells that LibreOffice added at the end of the row (possibly a bug). When I delete columns, LibreOffice seems to shift the subsequent cells (including the blank ones) to the left, and adds more blank cells at the end (up to 1024).
And now the key part: neither getLastCellNum() nor getPhysicalNumberOfCells() ignores blank cells. getLastCellNum() gives the last defined cell, and getPhysicalNumberOfCells() gives the number of defined cells, both including blank cells. There doesn't seem to be any method available that skips blank cells. The javadoc for getPhysicalNumberOfCells() is somewhat misleading - "if only columns 0,4,5 have values then there would be 3", but it's actually counting blank cells too, which don't really have values.
So the only solution I found is to loop through the cells and check if they are blank.
Side note: getLastRowNum() and getFirstCellNum() are 0-based but getLastCellNum() is 1-based, wtf?
Most likely you have some kind of formatting applied to your row. I have an empty xlsx file created with Excel, and getRow returns null for its empty rows.
@aditsu as per https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/Row.html, getLastCellNum() gets the index of the last cell contained in this row PLUS ONE.
+1 for the LibreOffice struggle! It's a bug, and in my opinion it is very random. I'm getting null randomly, and it often helps if I delete the EMPTY rows (below) and EMPTY columns (on the right side).
...
I'm trying to read the comments from all cells of an Excel document (using Apache POI).
I have a problem when empty (or missing) cells contain comments.
Currently the only solution that I have found is to:
iterate every row up to the last non-empty column
get all (even empty) cells
check if the cell's comment is not empty
if true: handle the comment
Some code:
if (row != null) {
cell = row.getCell(cellNum, Row.CREATE_NULL_AS_BLANK);
cellComment = cell.getCellComment();
if (cellComment != null)
...
}
The main problem is that I can't read comments from empty rows, or comments that come after the last non-empty cell.
Better performance (compared to reading all cells in a row) would be nice, but the main point is to read ALL of the document's comments.
You can read the comments of a blank or null cell using a missing-cell policy: row.getCell(int cellnum, MissingCellPolicy policy) lets you deal with cells that are blank or null.
For example, say the row at index 7 in your sheet is blank and its cell at column index 5 has a comment (say "Hello"), and you need to read that comment. Just do the following:
Comment comment = sheet.getRow(7).getCell(5, Row.CREATE_NULL_AS_BLANK).getCellComment();
System.out.println(comment.getString());
will print "Hello", provided the row itself exists (getRow returns null for a row that is not defined in the sheet).
To all martyrs who are using Apache POI and trying to do what the OP wants: use the sheet.getCellComments() method. It returns a Map whose keys are CellAddress objects and whose values are Comment instances, even if the cells that hold them are null or missing.
To make that kind of cell visible to POI iterators, simply call getCell with Row.MissingCellPolicy.CREATE_NULL_AS_BLANK.
e.g. (for Java 1.8+):
sheet.getCellComments().forEach((cellAddress, o) ->
sheet.getRow(cellAddress.getRow()).getCell(cellAddress.getColumn()
, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK));
Using Apache POI, I'm able to find a named range:
XSSFName[] ranges = new XSSFName[workbook.getNumberOfNames()];
for (int i = 0; i < workbook.getNumberOfNames(); i++)
    ranges[i] = workbook.getNameAt(i);
With that, I'm able to create an AreaReference:
AreaReference area = new AreaReference(ranges[0].getRefersToFormula());
And then finally I can get all the cells within that range:
CellReference[] cells = area.getAllReferencedCells();
That all works just fine. But I have a use case where I have to redefine the area that the range covers. Is there a way to do that? I notice that the range.getRefersToFormula() method returns a String, something like MySheet!$A$1:$B$8. There is a range.setRefersToFormula(String formula), but I've got to believe there's a way other than resorting to writing an Excel range formula parser on my own. Is there no way to generate an AreaReference from a set of cell references, or something more type-safe? Do I actually have to generate a String to represent the new range? I would think there would be an API somewhere to help me with this, but I can't seem to find it.
Update
I found some API, but it doesn't seem to work, at least it doesn't save properly. Here's what I did.
AreaReference newArea = new AreaReference(firstCell, lastCell);
ranges[0].setRefersToFormula(newArea.formatAsString());
It seems to set the formula correctly, but when I stream the workbook back out to disk, the range is completely wrong.
You can update the existing reference and set it as per your requirement.
Suppose the reference contains TestSheet!$A$1:$B$8 and you want to change it to MySheet!$B$5:$C$12.
For any cell, say "B5", at runtime,
cell.getReference();
will give you the cell reference (in this example it returns "B5").
String startCellColRef = CellReference.convertNumToColString(cell.getColumnIndex());
will give you the column reference ("B" if the current cell is B5). Taking charAt(0) of the reference string would only work for single-letter columns. Now
int startCellRowRef = cell.getRowIndex() + 1;
will give you the 1-based row number (5 if the current cell is B5). Assigning charAt(1) to an int would give the character code (53 for '5'), not the digit.
In the same way you can get your start and end cell references (say B5 and C12).
Now for how to update the existing reference: just set its value to the newly created reference string.
Name reference = wb.getName("NameReferenceInExcelSheet");
String referenceString = sheetName + "!$" + startCellColRef + "$" + startCellRowRef + ":$" + endCellColRef + "$" + endCellRowRef;
reference.setRefersToFormula(referenceString);
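In case it helps, here is a rough sketch in plain Java (no POI calls) of building such a reference string so that it also works for multi-letter columns and multi-digit rows. The class and method names (RangeFormula, columnLetters, quoteSheetName, build) are made up for illustration and are not part of POI, and the sheet-name quoting rule is a simplified assumption rather than Excel's full rule.

```java
class RangeFormula {

    // Converts a 0-based column index to A1-style letters: 0 -> "A", 25 -> "Z", 26 -> "AA".
    static String columnLetters(int columnIndex) {
        StringBuilder sb = new StringBuilder();
        int n = columnIndex + 1; // switch to 1-based for the base-26 conversion
        while (n > 0) {
            int rem = (n - 1) % 26;
            sb.insert(0, (char) ('A' + rem));
            n = (n - 1) / 26;
        }
        return sb.toString();
    }

    // Wraps the sheet name in single quotes unless it is a plain identifier.
    // Simplified rule, assumed here for illustration only.
    static String quoteSheetName(String sheetName) {
        if (sheetName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            return sheetName;
        }
        return "'" + sheetName.replace("'", "''") + "'";
    }

    // Builds an absolute area reference from 0-based row and column indices.
    static String build(String sheetName, int firstRow, int firstCol, int lastRow, int lastCol) {
        return quoteSheetName(sheetName)
                + "!$" + columnLetters(firstCol) + "$" + (firstRow + 1)
                + ":$" + columnLetters(lastCol) + "$" + (lastRow + 1);
    }

    public static void main(String[] args) {
        // B5 is row index 4, column index 1; C12 is row index 11, column index 2.
        System.out.println(build("MySheet", 4, 1, 11, 2)); // prints MySheet!$B$5:$C$12
    }
}
```

The resulting string could then be passed to setRefersToFormula as in the answer above. Working from the 0-based indices avoids pulling characters out of the "B5" string by position.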