I am trying to read an Excel file with Apache POI in Java. I iterate through the rows and then through the cells. To read a cell I use this method:
private String readCell(Cell cell) {
    try {
        switch (cell.getCellType()) {
            case NUMERIC:
                if (format.isParseNumbersToInt()) {
                    return ((int) cell.getNumericCellValue()) + "";
                } else {
                    return cell.getNumericCellValue() + "";
                }
            case STRING:
            case _NONE:
                return cell.getStringCellValue();
            case FORMULA:
                if (format.isUseCashedFormulaValue()) {
                    cell.removeFormula();
                    return readCell(cell);
                } else {
                    return cell.getCellFormula() + "";
                }
            case BLANK:
                return format.getBlankValue();
            case BOOLEAN:
                return cell.getBooleanCellValue() + "";
            case ERROR:
                if (format.isReadErrorCells()) {
                    return "ERROR_" + cell.getErrorCellValue();
                } else {
                    return format.getErrorCellValue();
                }
        }
    } catch (Exception e) {
        throw new IllegalArgumentException("Failed to read cell: " + cell.getAddress(), e);
    }
    throw new IllegalStateException("Unknown CellType: " + cell.getCellType().name());
}
At one point an XmlValueDisconnectedException is thrown:
Caused by: org.apache.xmlbeans.impl.values.XmlValueDisconnectedException
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1274)
at org.apache.xmlbeans.impl.values.XmlObjectBase.getStringValue(XmlObjectBase.java:1529)
at org.apache.poi.xssf.usermodel.XSSFCell.convertSharedFormula(XSSFCell.java:491)
at org.apache.poi.xssf.usermodel.XSSFCell.getCellFormula(XSSFCell.java:469)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDeleteFormula(XSSFSheet.java:4654)
at org.apache.poi.xssf.usermodel.XSSFCell.removeFormulaImpl(XSSFCell.java:571)
at org.apache.poi.ss.usermodel.CellBase.removeFormula(CellBase.java:182)
at de.heuboe.base.excel.controller.reader.ExcelReader.readCell(ExcelReader.java:356)
... 89 more
The cell where this happens looks the same as all the others. (The original post showed a screenshot of that part of the file here.)
D219 contains a string, and each of the following cells is a reference to the cell one row above, e.g. D220: "=D219" and D221: "=D220". The same goes for columns E, F and G.
The whole file looks like this and works, but at this point the program crashes, and I don't know why.
According to the stack trace, there is a problem with a shared formula.
If you have formulas =D6, =D7, =D8, ... =D219, =D220, ... and so on in column D, then the complete formula is not stored for every cell. Instead, only one cell stores the complete formula, and the following cells store only a shared reference to it.
In OOXML this looks like this:
In XML of cell D8: <f ref="D8:D300" t="shared" si="1">D7</f>
In XML of cell D9:D300: <f t="shared" si="1"/>
This Excel behavior tends to be fragile if something other than Excel manipulates rows containing such shared formulas.
Cell.removeFormula is a pretty new feature in Apache POI. It might be buggy. But as designed, it should know about such shared formulas and respect them. So to find out what really leads to that XmlValueDisconnectedException, one would need the Excel file. Then one could look into the sheet's XML and check whether something in the shared formula's XML differs from the default that XSSFCell.convertSharedFormula expects.
But do you really need Cell.removeFormula? If the goal is simply to get the cached formula value instead of the formula string itself, without evaluating anything, then one could read that cached value the same way as the other cell values, switching on the cached formula result type instead of the cell type.
Example:
...
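    case FORMULA:
        if (isUseCashedFormulaValue) {
            //cell.removeFormula();
            //return readCell(cell);
            switch (cell.getCachedFormulaResultType()) {
                case NUMERIC:
                    return String.valueOf(cell.getNumericCellValue());
                case STRING:
                    return cell.getStringCellValue();
                case BOOLEAN:
                    return String.valueOf(cell.getBooleanCellValue());
                case ERROR:
                    return "ERROR_" + cell.getErrorCellValue();
                default:
                    // not expected for a formula cell; fail here rather than fall through into the next outer case
                    throw new IllegalStateException("Unexpected cached result type: " + cell.getCachedFormulaResultType());
            }
        } else {
            return cell.getCellFormula();
        }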
...
I have this strange situation and I need some tips on how to resolve it.
I have a column (let's call it column K) with values that are the result of a formula (the values of this column are taken from another sheet). All the values in column K are set as String.
I followed all the guidelines from this page: https://poi.apache.org/components/spreadsheet/eval.html
but I have a real problem extracting numbers (example: 12345) and dates (08/09/2022).
When I extract the number 12345 in Java I get 12.34.5, and when I extract the date (08/09/2022) I get the value 44813.0.
The pseudocode I was using is this:
FileInputStream fis = new FileInputStream("/somepath/test.xls");
Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
Sheet sheet = wb.getSheetAt(0);
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
// suppose your formula is in B3
CellReference cellReference = new CellReference("B3");
Row row = sheet.getRow(cellReference.getRow());
Cell cell = row.getCell(cellReference.getCol());
if (cell != null) {
    switch (evaluator.evaluateFormulaCell(cell)) {
        case Cell.CELL_TYPE_BOOLEAN:
            System.out.println(cell.getBooleanCellValue());
            break;
        case Cell.CELL_TYPE_NUMERIC:
            System.out.println(cell.getNumericCellValue());
            break;
        case Cell.CELL_TYPE_STRING:
            System.out.println(cell.getStringCellValue());
            break;
        case Cell.CELL_TYPE_BLANK:
            break;
        case Cell.CELL_TYPE_ERROR:
            System.out.println(cell.getErrorCellValue());
            break;
        // CELL_TYPE_FORMULA will never occur
        case Cell.CELL_TYPE_FORMULA:
            break;
    }
}
Can someone give me some tips on how to resolve it?
Resolved:
Guys, as always, some tools have their own logic that you will never figure out unless you have worked with them.
The solution was really easy and crazy :)
I selected all the content of the column with the formulas and other stuff in sheet 1 and simply pasted everything into another sheet, sheet 2.
After pasting into sheet 2, all the strange logic about formulas, cached values and the real cell content was gone: everything was visible as a String without any formula.
So just by calling cell.toString() I get the string value of everything.
Sometimes the easiest solutions are the hardest to reason about, lol.
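A side note on the date symptom from the question: a value like 44813.0 is how Excel stores a date internally, as a count of days, so reading the cell as a number returns that serial rather than the formatted text. Below is a minimal JDK-only sketch of the conversion, assuming the workbook uses the default 1900 date system; the class and method names are made up for illustration. With POI itself, DataFormatter.formatCellValue(cell, evaluator) or DateUtil.getJavaDate(double) would normally do this for you.

```java
import java.time.LocalDate;

// Hypothetical helper for illustration; not part of Apache POI.
final class ExcelSerialDates {
    // Day 0 of the 1900 date system as java.time sees it.
    // Assumptions: the workbook does not use the 1904 date system, and the
    // serial is 61 or greater (i.e. on or after 1 March 1900), because Excel
    // counts a non-existent 29 February 1900.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    /** Converts the whole-day part of an Excel serial number to a calendar date. */
    static LocalDate toLocalDate(double serial) {
        return EPOCH.plusDays((long) Math.floor(serial));
    }
}
```

For example, ExcelSerialDates.toLocalDate(25569) gives 1970-01-01. The fractional part of a serial is the time of day and is ignored here.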
I have a spreadsheet with a lot of formulas in it and several tabs. One of the tabs is for Input of numbers into 10 fields. Another tab is for viewing the output of calculated formulas.
Using Apache POI, I have opened the spreadsheet and input my numbers. The problem comes when I try to evaluate the spreadsheet.
I've tried
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
helper.createFormulaEvaluator();
evaluator.evaluateAll();
And I get an error (that nobody seems to have an answer for): Unexpected arg eval type (org.apache.poi.ss.formula.eval.MissingArgEval)] with root cause
So I've changed to evaluating cells individually so I could find which cell has the error, so my code looks like this:
FormulaEvaluator evaluator = this.workbook.getCreationHelper().createFormulaEvaluator();
for (Sheet sheet : this.workbook) {
    System.out.println("Evaluating next sheet");
    System.out.println(sheet.getSheetName());
    for (Row r : sheet) {
        System.out.println("Row Number:");
        System.out.println(r.getRowNum());
        for (Cell c : r) {
            if (c.getCellType() == Cell.CELL_TYPE_FORMULA) {
                System.out.println(c.getColumnIndex());
                try {
                    evaluator.evaluateFormulaCell(c);
                } catch (Exception e) {
                    rowArray.add(r.getRowNum());
                    cellArray.add(c.getColumnIndex());
                    System.out.println("Skipping failed cell");
                }
            }
        }
    }
And I'm getting the same error as when I run evaluateAll.
By putting the little bit of debugging in there, I found that the error is coming from Cell L3, which contains formula: =D5. Since the evaluator goes by row:column, it evaluates everything on row 3 first before getting to 5, so L3 references a field that has not been evaluated yet, and therefore throws an error.
I tried catching the errors and storing the row and cell number in an array, then after everything in a sheet is processed, attempt to reprocess the unprocessed cells, but I still get the same result. I'm a bit perplexed why the retry didn't work.
Retry code:
    // try to fix any failed evaluations here
    Iterator cellItr = cellArray.iterator();
    Iterator rowItr = rowArray.iterator();
    while (cellItr.hasNext()) {
        Integer cellElement = (int) cellItr.next();
        Integer rowElement = (int) rowItr.next();
        XSSFRow row = sheet.getRow(rowElement);
        XSSFCell cell = row.getCell(cellElement);
        System.out.println("Re-evaluating: " + rowElement + " : " + cellElement);
        evaluator.evaluateFormulaCell(cell);
    }
}
The retry code gave the same result.
I tried changing the original evaluator to use evaluateInCell to change the formula to an actual number, but that didn't seem to help.
----------------- UPDATE ---------------------
I just realized that evaluateFormulaCell is deprecated in favor of evaluateFormulaCellEnum. I put all of the code into a function, ran the function multiple times, and realized it was evaluating all of the cells over and over again. So I switched to evaluateInCell and found that it evaluates each cell only once, but I still can't get past the cells mentioned.
Here is my updated code, which I have inside a function that I run 5 times:
for (Sheet sheet : this.workbook) {
    System.out.println("Evaluating next sheet" + sheet.getSheetName());
    for (Row r : sheet) {
        for (Cell c : r) {
            if (c.getCellType() == Cell.CELL_TYPE_FORMULA) {
                System.out.println("Cell index: " + r.getRowNum() + " - " + c.getColumnIndex());
                try {
                    evaluator.evaluateInCell(c);
                } catch (Exception e) {
                    try {
                        evaluator.evaluateFormulaCellEnum(c);
                    } catch (Exception ee) {
                        System.out.println("Skipping failed cell after 2 attempts");
                    }
                }
            }
        }
    }
}
With the debugging I have in place, I was able to see which cells in the spreadsheet were failing, so I saved the formulas from the failing cells in a text document and replaced the formulas with their values, then recompiled the code and the spreadsheet actually evaluated!
Then I went through all of the cells and put their formulas back two by two until it broke again. It turned out to be a case I already knew about, but searching a spreadsheet for it is no piece of cake.
This was the formula with the issue: =ROUNDUP('HW page'!$H$53*'HW page'!$H$54,)
I added a 0 as the last parameter so it looks like this: =ROUNDUP('HW assumptions'!$H$53*'HW assumptions'!$H$54,0), then the evaluator works.
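Since hunting for such a formula by hand is tedious, here is a rough sketch of a scanner that flags formula strings containing an empty argument, such as the trailing ",)" above. It uses only the JDK and works on the text returned by cell.getCellFormula(). The class and method names are hypothetical, and it is a heuristic that skips quoted strings and quoted sheet names, not a real formula parser.

```java
// Hypothetical helper for illustration; not part of Apache POI.
final class EmptyArgumentFinder {
    /** Returns true if the formula text has an argument position with nothing in it. */
    static boolean hasEmptyArgument(String formula) {
        char prev = '\0';   // last significant character seen outside quotes
        char quote = '\0';  // the open quote character, or 0 when outside quotes
        for (int i = 0; i < formula.length(); i++) {
            char c = formula.charAt(i);
            if (quote != '\0') {
                // Inside "text" or 'Sheet name': ignore everything up to the closing quote.
                // A doubled quote simply closes and reopens, which is harmless here.
                if (c == quote) {
                    quote = '\0';
                }
                continue;
            }
            if (c == '"' || c == '\'') {
                quote = c;
                prev = c;
                continue;
            }
            if (Character.isWhitespace(c)) {
                continue;
            }
            if ((c == ',' || c == ')') && prev == ',') {
                return true;  // ",," or ",)"
            }
            if (c == ',' && prev == '(') {
                return true;  // "(,"
            }
            prev = c;
        }
        return false;
    }
}
```

Calling EmptyArgumentFinder.hasEmptyArgument(c.getCellFormula()) inside the formula loop would give a list of candidate cells to inspect. Empty arguments are legal in Excel itself, so a hit only marks a formula worth checking, not necessarily a broken one.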
I am writing a program that reads excel files using apache POI. I'm getting all the values, but I want to know which cells are dependent on others (using the formula for the cell).
I've tried using String formula = cell.getCellFormula(), but this just returns me the cell index (eg. H5). Is there any other way I can do this?
Here's my code for reading cells:
private void handleCell(int type, Cell cell)
{
    switch (type)
    {
        case Cell.CELL_TYPE_STRING:
            System.out.print(cell.getStringCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_NUMERIC:
            System.out.print(cell.getNumericCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_BOOLEAN:
            System.out.print(cell.getBooleanCellValue() + "\t\t");
            break;
        case Cell.CELL_TYPE_FORMULA:
            String form = cell.getCellFormula();
            handleCell(cell.getCachedFormulaResultType(), cell);
            break;
        default :
    }
}
Have a look at org.apache.poi.ss.formula.FormulaParser.
It has a static method
public static Ptg[] parse(
        java.lang.String formula,
        FormulaParsingWorkbook workbook,
        int formulaType,
        int sheetIndex)
According to the documentation, it parses a formula string into a list of tokens in RPN order.
The tokens (Ptg = "parse things") can be checked for their operand class (REF/VALUE/ARRAY) using public final byte getPtgClass().
I have not tested it, but it may be the way to go: parse the formula, then check each Ptg entry for its type (REF?) and get the destination cell.
See:
https://poi.apache.org/apidocs/org/apache/poi/ss/formula/FormulaParser.html
https://poi.apache.org/apidocs/org/apache/poi/ss/formula/ptg/Ptg.html
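As a lighter-weight alternative to the FormulaParser route, and only as a rough sketch, the formula text from cell.getCellFormula() can be scanned for A1-style references with a regular expression. This is a different technique from the Ptg-based one described above: it uses only the JDK, the class name and the pattern are my own, and it will miss or misread things a real parser handles (named ranges, structured references, whole-column ranges, text inside string literals).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Apache POI.
final class FormulaReferences {
    // Optional sheet prefix ('My Sheet'! or Sheet1!), then a cell such as A1 or $H$53.
    // The lookbehind and lookahead keep function names like LOG10( and
    // longer identifiers from being mistaken for cell references.
    private static final Pattern CELL_REF = Pattern.compile(
            "(?<![A-Za-z0-9_.$'!])"
            + "(?:(?:'[^']+'|[A-Za-z0-9_.]+)!)?"
            + "\\$?[A-Z]{1,3}\\$?[0-9]+"
            + "(?![A-Za-z0-9_(.])");

    /** Returns the cell references found in the formula text, in order of appearance. */
    static List<String> referencedCells(String formula) {
        List<String> refs = new ArrayList<>();
        Matcher m = CELL_REF.matcher(formula);
        while (m.find()) {
            // Drop the absolute-reference markers so $H$53 and H53 compare equal.
            refs.add(m.group().replace("$", ""));
        }
        return refs;
    }
}
```

A range such as B2:C3 is reported as its two corner cells, so expanding it to every cell in between is left to the caller.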
How can I add integer validation and date validation to a particular cell using POI, validate after the user enters data, and show an error message if the data is wrong?
Thanks in advance.
I once encountered a similar situation for validating an excel file. You can code like this:
if (cell != null) {
    switch (cell.getCellType()) {
        case Cell.CELL_TYPE_STRING:
            //Validate String as required
            break;
        case Cell.CELL_TYPE_NUMERIC:
            if (DateUtil.isCellDateFormatted(cell)) {
                //Validate Date
            } else {
                //Validate Number
            }
            break;
        default:
            //Handle Default
    }
}
I suggest that you write separate validation handlers for each type (string, number and date) and just invoke them from your switch case.
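To illustrate the idea of separate handlers, here is a small JDK-only sketch of what the number and date validators might look like. All names and the specific rules (a whole number within a range, a date within a window) are hypothetical. In the switch above you would pass cell.getNumericCellValue() to the first handler and a date obtained from the cell to the second.

```java
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical validators for illustration; the rules are assumptions, not POI API.
final class CellValidators {
    /** Returns an error message if the value is not a whole number in [min, max]. */
    static Optional<String> validateInteger(double value, long min, long max) {
        if (Double.isNaN(value) || Double.isInfinite(value) || value != Math.rint(value)) {
            return Optional.of("Expected a whole number but got " + value);
        }
        if (value < min || value > max) {
            return Optional.of("Expected a value between " + min + " and " + max
                    + " but got " + (long) value);
        }
        return Optional.empty();
    }

    /** Returns an error message if the date is missing or outside [earliest, latest]. */
    static Optional<String> validateDate(LocalDate value, LocalDate earliest, LocalDate latest) {
        if (value == null) {
            return Optional.of("Expected a date but the cell was empty");
        }
        if (value.isBefore(earliest) || value.isAfter(latest)) {
            return Optional.of("Expected a date between " + earliest + " and " + latest
                    + " but got " + value);
        }
        return Optional.empty();
    }
}
```

Returning an Optional message instead of throwing lets the caller collect every problem in the sheet and report them together.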
In my program I am reading in and parsing a file for resources.
I extract a string which represents the resource type, do a simple if then else statement to check if it matches any known types and throw an error if it doesn't:
if (type.toLowerCase() == "spritesheet") {
    _type = ResourceType.Spritesheet;
} else if (type.toLowerCase() == "string") {
    _type = ResourceType.String;
} else if (type.toLowerCase() == "texture") {
    _type = ResourceType.Texture;
} else if (type.toLowerCase() == "num") {
    _type = ResourceType.Number;
} else {
    throw new Exception("Invalid Resource File - Invalid type: |" + type.toLowerCase() + "|");
}
Ignoring my bad naming and nondescript exception, this statement always goes to the final else, even if type IS "spritesheet" as read in from the file, etc.
java.lang.Exception: Invalid Resource File - Invalid type: |spritesheet|
at Resource.Load(Resource.java:55) //Final else.
If I set type to "spritesheet" before this call, it works, so I'm wondering if it's some kind of encoding error or something?
I haven't done much work in java so I might be missing something simple :)
Assuming type is a String, you want to use String.equals() to test for equality. Using the == operator tests to see if the variables are references to the same object.
Also, to make your life easier, I would suggest using String.equalsIgnoreCase() as this will save you from calling toLowerCase().
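A minimal sketch of the difference, assuming the type string is built at runtime (as it is when read from a file) rather than written as a literal in the source. The class and method names are only for illustration.

```java
// Hypothetical demo class for illustration.
final class StringEqualityDemo {
    /** Reference comparison: true only if both variables point at the same object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    /** Content comparison that ignores case and tolerates a null argument. */
    static boolean isSpritesheet(String type) {
        return "spritesheet".equalsIgnoreCase(type);
    }

    public static void main(String[] args) {
        // Built at runtime, so it is a different object from the literal "spritesheet".
        String fromFile = new String("spritesheet".toCharArray());

        System.out.println(sameReference(fromFile, "spritesheet")); // false: different objects
        System.out.println(fromFile.equals("spritesheet"));         // true: same characters
        System.out.println(isSpritesheet("SpriteSheet"));           // true: case is ignored
    }
}
```

Assigning the literal "spritesheet" to type before the check makes == succeed only because both sides then refer to the same interned literal, which is probably why that experiment appeared to work.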
Starting from Java 7 you can use Strings in switch statements! :)
The following should work:
switch (type.toLowerCase()) {
    case "spritesheet": _type = ResourceType.Spritesheet; break;
    case "string": _type = ResourceType.String; break;
    case "texture": _type = ResourceType.Texture; break;
    case "num": _type = ResourceType.Number; break;
    default: throw new Exception("Invalid Resource File " +
            "- Invalid type: |" + type.toLowerCase() + "|");
}
I haven't tried it yet, let me know how it goes!