Apache POI for excel get dependent cells - java

I am writing a program that reads Excel files using Apache POI. I can read all the values, but I also want to know which cells depend on other cells (based on each cell's formula).
I've tried String formula = cell.getCellFormula(), but this just returns the referenced cell (e.g. H5). Is there another way to do this?
Here's my code for reading cells:
private void handleCell(int type, Cell cell)
{
    switch (type)
    {
        case Cell.CELL_TYPE_STRING:
            System.out.print(cell.getStringCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_NUMERIC:
            System.out.print(cell.getNumericCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_BOOLEAN:
            System.out.print(cell.getBooleanCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_FORMULA:
            String form = cell.getCellFormula();
            // Print the cached result of the formula using its result type
            handleCell(cell.getCachedFormulaResultType(), cell);
            break;
        default:
            break;
    }
}

Have a look at org.apache.poi.ss.formula.FormulaParser.
It has a static method
public static Ptg[] parse(
        java.lang.String formula,
        FormulaParsingWorkbook workbook,
        int formulaType,
        int sheetIndex)
According to the documentation, it parses a formula string into an array of tokens in RPN order.
The tokens (Ptg = "parse thing") can be checked for their operand class (REF/VALUE/ARRAY) using public final byte getPtgClass().
I have not tested it, but it may be the way to go: parse the formula, then check each Ptg entry for its type (REF?) and get the referenced cell.
See:
https://poi.apache.org/apidocs/org/apache/poi/ss/formula/FormulaParser.html
https://poi.apache.org/apidocs/org/apache/poi/ss/formula/ptg/Ptg.html
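If a full parse is more than you need, a rough alternative is to scan the formula string returned by cell.getCellFormula() for A1-style references. The sketch below is JDK-only and deliberately not POI's parser: the class name FormulaRefs and the regular expression are assumptions made for illustration. It does not expand ranges (D23:D24 yields only the two corner cells), it drops sheet qualifiers, and it can misread unusual defined names, so treat FormulaParser as the robust route.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache POI.
final class FormulaRefs {
    // A1-style reference: optional '$', 1-3 column letters, optional '$', row digits.
    // The lookarounds skip function names such as LOG10( and longer identifiers.
    private static final Pattern REF = Pattern.compile(
            "(?<![A-Za-z0-9_$])\\$?([A-Z]{1,3})\\$?([0-9]+)(?![A-Za-z0-9_(])");

    /** Returns the distinct cell references in a formula, in order of first appearance. */
    static Set<String> referencedCells(String formula) {
        // Blank out string literals so text such as "A1" inside quotes is not reported.
        String withoutStrings = formula.replaceAll("\"[^\"]*\"", "\"\"");
        Set<String> refs = new LinkedHashSet<>();
        Matcher m = REF.matcher(withoutStrings);
        while (m.find()) {
            refs.add(m.group(1) + m.group(2)); // normalise $H$5 to H5
        }
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(referencedCells("SUM(D23:D24)+$H$5"));
    }
}
```

Calling referencedCells(cell.getCellFormula()) for every formula cell gives a simple "depends on" map that can be inverted to find dependents.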

Related

Problem with FormulaEvaluator cell - Apache POI

I have a strange situation and need some tips on how to resolve it.
I have a column (let's call it column K) whose values are the result of a formula (the values are taken from another sheet). All the values in column K are set as String.
I followed the guidelines from https://poi.apache.org/components/spreadsheet/eval.html
but I have a real problem extracting numbers (for example 12345) and dates (for example 08/09/2022).
When I extract the number 12345 in Java I get 12.34.5, and when I extract the date (08/09/2022) I get the value 44813.0.
The code I was using is this:
FileInputStream fis = new FileInputStream("/somepath/test.xls");
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
Sheet sheet = wb.getSheetAt(0);
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
// suppose your formula is in B3
CellReference cellReference = new CellReference("B3");
Row row = sheet.getRow(cellReference.getRow());
Cell cell = row.getCell(cellReference.getCol());
if (cell != null) {
    switch (evaluator.evaluateFormulaCell(cell)) {
        case Cell.CELL_TYPE_BOOLEAN:
            System.out.println(cell.getBooleanCellValue());
            break;
        case Cell.CELL_TYPE_NUMERIC:
            System.out.println(cell.getNumericCellValue());
            break;
        case Cell.CELL_TYPE_STRING:
            System.out.println(cell.getStringCellValue());
            break;
        case Cell.CELL_TYPE_BLANK:
            break;
        case Cell.CELL_TYPE_ERROR:
            System.out.println(cell.getErrorCellValue());
            break;
        // CELL_TYPE_FORMULA will never occur
        case Cell.CELL_TYPE_FORMULA:
            break;
    }
}
Can someone give me some tips on how to resolve it?
Resolved:
As always, some tools have their own logic that you will never know unless you have worked with them.
The solution was really easy and crazy :)
I selected all the content of the formula column, along with the other content in sheet 1, and simply pasted everything into another sheet (sheet 2).
After pasting into sheet 2, all the strange logic around formulas, cached values, and real cell content was gone: everything was visible as a String without any formula.
So just by calling cell.toString() I get the string value of everything.
Sometimes the easiest solutions are the hardest to reason about, lol.
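For context on the 44813.0 value: Excel stores dates as serial day numbers, so a numeric read of a date cell returns that count rather than formatted text. The following JDK-only sketch converts such a serial to a calendar date, assuming the workbook uses the default 1900 date system; the class name ExcelSerialDate is hypothetical. Within POI itself, DateUtil.isCellDateFormatted(cell) together with cell.getDateCellValue(), or DataFormatter.formatCellValue(cell, evaluator), are probably the more direct routes.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of Apache POI.
final class ExcelSerialDate {
    // In the 1900 date system, day 0 is effectively 1899-12-30 for serials
    // after February 1900 (Excel counts a non-existent 1900-02-29).
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    /** Converts the whole-day part of an Excel serial number to a date. */
    static LocalDate toLocalDate(double serial) {
        return EPOCH.plusDays((long) Math.floor(serial));
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(44813.0));
    }
}
```

The fractional part of a serial encodes the time of day and is ignored here.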

XmlValueDisconnectedException when remove Formula

I am trying to read an Excel file with Java POI. I iterate through the rows and then through the cells. To read a cell I use this method:
private String readCell(Cell cell) {
    try {
        switch (cell.getCellType()) {
            case NUMERIC:
                if (format.isParseNumbersToInt()) {
                    return ((int) cell.getNumericCellValue()) + "";
                } else {
                    return cell.getNumericCellValue() + "";
                }
            case STRING:
            case _NONE:
                return cell.getStringCellValue();
            case FORMULA:
                if (format.isUseCashedFormulaValue()) {
                    cell.removeFormula();
                    return readCell(cell);
                } else {
                    return cell.getCellFormula() + "";
                }
            case BLANK:
                return format.getBlankValue();
            case BOOLEAN:
                return cell.getBooleanCellValue() + "";
            case ERROR:
                if (format.isReadErrorCells()) {
                    return "ERROR_" + cell.getErrorCellValue();
                } else {
                    return format.getErrorCellValue();
                }
        }
    } catch (Exception e) {
        throw new IllegalArgumentException("Failed to read cell: " + cell.getAddress(), e);
    }
    throw new IllegalStateException("Unknown CellType: " + cell.getCellType().name());
}
At one point an XmlValueDisconnectedException is thrown:
Caused by: org.apache.xmlbeans.impl.values.XmlValueDisconnectedException
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1274)
at org.apache.xmlbeans.impl.values.XmlObjectBase.getStringValue(XmlObjectBase.java:1529)
at org.apache.poi.xssf.usermodel.XSSFCell.convertSharedFormula(XSSFCell.java:491)
at org.apache.poi.xssf.usermodel.XSSFCell.getCellFormula(XSSFCell.java:469)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDeleteFormula(XSSFSheet.java:4654)
at org.apache.poi.xssf.usermodel.XSSFCell.removeFormulaImpl(XSSFCell.java:571)
at org.apache.poi.ss.usermodel.CellBase.removeFormula(CellBase.java:182)
at de.heuboe.base.excel.controller.reader.ExcelReader.readCell(ExcelReader.java:356)
... 89 more
This spot looks the same as all the others. At this spot in the file, D219 contains a string, and each following cell is a reference to the cell one row above, e.g. D220: "=D219" and D221: "=D220". The same goes for columns E, F and G.
The whole file looks like this and works, but at this point the program crashes, and I don't know why.
According to the stack trace, there is a problem with a shared formula.
If you have formulas =D6, =D7, =D8, ... =D219, =D220, ... and so on in column D, then the complete formula is not stored for every cell. Instead, only one cell stores the complete formula, and the following cells store only a shared reference to it.
In OOXML this looks like this:
In the XML of cell D8: <f ref="D8:D300" t="shared" si="1">D7</f>
In the XML of cells D9:D300: <f t="shared" si="1"/>
This Excel behavior tends to be fragile if something other than Excel manipulates rows containing such shared formulas.
Cell.removeFormula is a fairly new feature in Apache POI and might be buggy. But as designed, it should know about such shared formulas and respect them. So to find out what really leads to the XmlValueDisconnectedException, one would need the Excel file. Then one could look into the sheet's XML and check whether something in the shared formula's XML differs from what XSSFCell.convertSharedFormula expects.
But do you really need Cell.removeFormula? If the goal is simply to get the cached formula value instead of the formula string, without evaluating, then you can read that cached value the same way as the other cell values, switching on the cached formula result type.
Example:
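...
case FORMULA:
    if (isUseCashedFormulaValue) {
        //cell.removeFormula();
        //return readCell(cell);
        // Read the cached result directly instead of removing the formula
        switch (cell.getCachedFormulaResultType()) {
            case NUMERIC:
                return String.valueOf(cell.getNumericCellValue());
            case STRING:
                return cell.getStringCellValue();
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            case ERROR:
                return "ERROR_" + cell.getErrorCellValue();
            default:
                // Fail loudly rather than fall through into the next outer case
                throw new IllegalStateException("Unexpected cached result type in " + cell.getAddress());
        }
    } else {
        return cell.getCellFormula();
    }
...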

get Result in getCellFormula

I am reading an Excel file with Java. My cell contains a formula, but I want to get the result (20.000, 2.154, etc.). Instead I get:
IF(F2="Buy",+(H2-G2+1)*I2,+(H2-G2+1)*I2*(-1))
switch (cell.getCellType()) {
    case Cell.CELL_TYPE_FORMULA:
        stringValue = cell.getCellFormula();
        break;
    ....
The problem is that I couldn't calculate this formula myself, because I read the Excel file cell by cell, so when I reach ... H2-G2 ... my code doesn't know those values.
I am using
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>3.15</version>
</dependency>
How can I get the value of the cell?
Cell.CELL_TYPE_FORMULA: ---> IF(F2="Buy",+(H2-G2+1)*I2,+(H2-G2+1)*I2*(-1))
Cell.CELL_TYPE_STRING: ----> IF(F2="Buy",+(H2-G2+1)*I2,+(H2-G2+1)*I2*(-1))
Thanks!!
EDIT:
I changed the code to:
case Cell.CELL_TYPE_FORMULA:
    stringValue = String.valueOf(cell.getNumericCellValue());
    break;
Result: 5.475E7. This looks wrong; my Excel file shows 54.750.000.
The value you get is correct, so you're almost there; it's just not formatted the way you expect.
Try:
String stringValue = NumberFormat.getNumberInstance().format(cell.getNumericCellValue());
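NumberFormat.getNumberInstance() uses the JVM's default locale, so the grouping separator depends on where the code runs. A minimal sketch that makes the locale explicit is shown below; the class and method names are hypothetical, and Locale.GERMANY is only an assumed stand-in for a locale that groups digits with dots as in the question.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper for illustration.
final class PlainNumbers {
    /** Formats with the grouping separators of the given locale. */
    static String grouped(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    /** Formats without grouping and without scientific notation. */
    static String plain(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        double cached = 5.475E7; // the value from the question
        System.out.println(grouped(cached, Locale.GERMANY));
        System.out.println(grouped(cached, Locale.US));
        System.out.println(plain(cached));
    }
}
```

Neither method changes the number itself; 5.475E7 is just the default String.valueOf rendering of the same double.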

Retrieving Values From Excel Apache POI

I'm trying to retrieve a specific value from each sheet in an Excel workbook.
The code works fine for a test workbook but does not work with the Excel file I actually need to read.
The error encountered is:
"Exception in thread "AWT-EventQueue-0" org.apache.poi.ss.formula.FormulaParseException: Specified named range 'Table156723451819202122232434567891011121314151619216710111213162024254567101718193456781112131623242528234789101314151619202128910111215234567891011121314161718192021222324252627282930312345678910111213141516171819202122232425262728293032234567891011140' does not exist in the current workbook."
The target workbook only has one month of sheets (1jan, 2jan, ...).
The target cell's formula is: =SUM(D23:D24)
The following is my code:
for (int i = 0; i <= 30; i++) {
    Sheet sheet = wb.getSheetAt(i);
    FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
    CellReference cellReference = new CellReference("D25");
    //CellReference cellReference = new CellReference("A4");
    Row row = sheet.getRow(cellReference.getRow());
    Cell cell = row.getCell(cellReference.getCol());
    //System.out.println("i:"+row.getCell(cellReference.getCol()));
    CellValue cellValue = evaluator.evaluate(cell);
    switch (cellValue.getCellType()) {
        case Cell.CELL_TYPE_BOOLEAN:
            System.out.println(cellValue.getBooleanValue());
            break;
        case Cell.CELL_TYPE_NUMERIC:
            System.out.println(cellValue.getNumberValue());
            break;
        case Cell.CELL_TYPE_STRING:
            System.out.println(cellValue.getStringValue());
            break;
        case Cell.CELL_TYPE_BLANK:
            break;
        case Cell.CELL_TYPE_ERROR:
            break;
        // CELL_TYPE_FORMULA will never happen
        case Cell.CELL_TYPE_FORMULA:
            break;
    }
}
I tried putting the same formula in my test workbook, and it works fine.
Please give me some guidance, as I've been stuck on this for a long time. Thanks so much!
I figured it out.
Using the .XLS format instead of .XLSX solved this problem.
Cheers! Happy coding.

How to Add Validations(Numeric,Date) to a Particular Cell using POI

How do I add integer validation or date validation to a particular cell using POI,
and, after the user enters data, show an error message if the data is wrong?
Thanks in advance.
I once encountered a similar situation when validating an Excel file. You can write code like this:
if (cell != null) {
    switch (cell.getCellType()) {
        case Cell.CELL_TYPE_STRING:
            //Validate String as required
            break;
        case Cell.CELL_TYPE_NUMERIC:
            if (DateUtil.isCellDateFormatted(cell)) {
                //Validate Date
            } else {
                //Validate Number
            }
            break;
        default:
            //Handle Default
    }
}
I suggest that you write separate validation handlers for each type (string, number and date) and just invoke them from your switch case.
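As a rough illustration of such handlers, here is a JDK-only sketch with one method per type. The class name, method names, and the rules themselves (non-blank text, whole number in a range, date in a range) are assumptions for illustration; substitute whatever rules your sheet requires. Note that this checks values after they have been read; for rules that Excel enforces while the user types, POI also exposes a DataValidationHelper via sheet.getDataValidationHelper().

```java
import java.time.LocalDate;

// Hypothetical validation handlers, one per cell type.
final class CellValidators {
    /** String cells: text must be present and not only whitespace. */
    static boolean isValidText(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Numeric cells: value must be a whole number within [min, max]. */
    static boolean isValidInteger(double value, long min, long max) {
        return value == Math.rint(value) && value >= min && value <= max;
    }

    /** Date-formatted cells: date must lie within [earliest, latest]. */
    static boolean isValidDate(LocalDate value, LocalDate earliest, LocalDate latest) {
        return value != null && !value.isBefore(earliest) && !value.isAfter(latest);
    }

    public static void main(String[] args) {
        System.out.println(isValidText("  "));
        System.out.println(isValidInteger(42.0, 1, 100));
        System.out.println(isValidDate(LocalDate.of(2022, 9, 8),
                LocalDate.of(2022, 1, 1), LocalDate.of(2022, 12, 31)));
    }
}
```

Each branch of the switch would then call the matching handler with cell.getStringCellValue(), cell.getNumericCellValue(), or the converted date, and report an error when it returns false.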
