The application I am working on creates Excel exports using Apache POI. It was brought to our attention, through a security audit, that cells containing malicious values can spawn arbitrary processes if the user is not careful enough.
To reproduce, run the following:
import java.io.FileOutputStream;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

public class BadWorkbookCreator {
    public static void main(String[] args) throws Exception {
        try (
            Workbook wb = new HSSFWorkbook();
            FileOutputStream fos = new FileOutputStream("C:/workbook-bad.xls")
        ) {
            Sheet sheet = wb.createSheet("Sheet");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue("Aaaaaaaaaa");
            row.createCell(1).setCellValue("-2+3 +cmd|'/C calc'!G20");
            wb.write(fos);
        }
    }
}
Then open the resulting file and follow these steps:
1. Click on (A) to select the cell with the malicious content (cell B1)
2. Click on (B) so that the cursor is in the formula editor
3. Press ENTER
4. You will be asked whether you allow Excel to run an external application; if you answer yes, Calc is launched (or any other malicious code)
One may say that the user is responsible for letting Excel run arbitrary things and that the user was warned. Still, the Excel file is downloaded from a trusted source, and someone may fall into the trap.
Using Excel, you can place a single quote in front of the text in the formula editor to escape it. Placing the single quote in the cell content programmatically (e.g. code as below) makes the single quote visible!
String cellValue = cell.getStringCellValue();
// isEmpty() guard: charAt(0) would throw on an empty string
if (cellValue != null && !cellValue.isEmpty() && "=-+#".indexOf(cellValue.charAt(0)) >= 0) {
    cell.setCellValue("'" + cellValue);
}
The question: Is there a way to keep the value escaped in the formula editor, but show the correct value, without the leading single quote, in the cell?
Thanks to the investigative work of Axel Richter and Nikos Paraskevopoulos...
From Apache POI 3.16 beta 1 onwards (or, for those who live dangerously, any nightly build after 20161105), CellStyle has the handy methods getQuotePrefixed() and setQuotePrefixed(boolean).
Your code could then become:
// Do this once for the workbook
CellStyle safeFormulaStyle = workbook.createCellStyle();
safeFormulaStyle.setQuotePrefixed(true);
// Per cell
String cellValue = cell.getStringCellValue();
// isEmpty() guard: charAt(0) would throw on an empty string
if (cellValue != null && !cellValue.isEmpty() && "=-+#".indexOf(cellValue.charAt(0)) >= 0) {
    cell.setCellStyle(safeFormulaStyle);
}
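The prefix check can also be kept in one small helper so that every export path applies the same rule. This is only a sketch in plain Java: the class and method names are hypothetical, and the trigger characters are the same "=-+#" set used above, which may not cover every case your audit cares about. Null and empty strings are treated as safe.

```java
final class FormulaPrefixCheck {
    // Characters that make Excel treat cell text as a formula; same set as in the snippet above.
    private static final String TRIGGER_CHARS = "=-+#";

    // Returns true when the text starts with a trigger character; null and "" are treated as safe.
    static boolean startsLikeFormula(String cellValue) {
        return cellValue != null
            && !cellValue.isEmpty()
            && TRIGGER_CHARS.indexOf(cellValue.charAt(0)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(startsLikeFormula("-2+3 +cmd|'/C calc'!G20")); // true
        System.out.println(startsLikeFormula("Aaaaaaaaaa"));              // false
        System.out.println(startsLikeFormula(""));                        // false
    }
}
```

A cell for which startsLikeFormula returns true would then get the quote-prefixed style.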
Thanks to the instant (kudos) response from the POI team (see accepted answer), this solution should be obsolete. I am keeping it as a reference; it could be useful in cases where an upgrade to POI >= 3.16 is not possible.
Thanks to the comment of Axel Richter (for which I am very thankful) I managed to work out a solution. It is definitely NOT as straightforward as in the case of XLSX files (XSSFWorkbook), because it involves creating the org.apache.poi.hssf.model.InternalWorkbook by hand; this class is marked as @Internal by the POI project, but is public as far as Java is concerned. Additionally, the method that corrects the problem, ExtendedFormatRecord.set123Prefix(true), is not documented!
Here is the solution, for what it's worth - compare it with the code in the question:
import java.io.FileOutputStream;
import org.apache.poi.hssf.model.InternalWorkbook;
import org.apache.poi.hssf.record.ExtendedFormatRecord;
import org.apache.poi.hssf.usermodel.HSSFCellStyle;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;

public class GoodWorkbookCreator {
    public static void main(String[] args) throws Exception {
        InternalWorkbook internalWorkbook = InternalWorkbook.createWorkbook();
        try (
            HSSFWorkbook wb = HSSFWorkbook.create(internalWorkbook);
            FileOutputStream fos = new FileOutputStream("C:/workbook-good.xls")
        ) {
            HSSFCellStyle style = (HSSFCellStyle) wb.createCellStyle();
            ExtendedFormatRecord xfr = internalWorkbook.getExFormatAt(internalWorkbook.getNumExFormats() - 1);
            xfr.set123Prefix(true); // THIS IS WHAT IT IS ALL ABOUT
            Sheet sheet = wb.createSheet("Sheet");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue("Aaaaaaaaaa");
            row.createCell(1).setCellValue("-2+3 +cmd|'/C calc'!G20");
            Cell cell = row.createCell(2);
            cell.setCellValue("-2+3 +cmd|'/C calc'!G20");
            cell.setCellStyle(style);
            wb.write(fos);
        }
    }
}
I'm trying to use the method sheet.setActiveCell(CellAddress addr) to make a range of multiple cells active at the same time. I've tried multiple versions of the Apache poi-ooxml library and am now using 3.16, which also supports the method sheet.setActiveCell(String addr) (I know 3.16 is old, but the issue is the same with the latest version).
Following the suggestions on this question: Is it possible to set the active range with Apache POI XSSF?
I've managed to get it to work, both with a custom CellAddress and with a String in the format "A1:B5".
The problem is that every time I open an xlsx in which a range of cells has been set to active using Apache POI, I get an error message from Excel saying that the file is damaged and needs to be recovered. If I accept, the recovery completes correctly, but the error is annoying since I have to open a great number of these files each day.
Is there a way to avoid this error from Excel (maybe by modifying how the xlsx is created or by changing some setting in Excel)?
Only one cell can be the active cell, and Sheet.setActiveCell sets only that one active cell. So sheet.setActiveCell("A1:B5") will run if setActiveCell(String addr) is available, but it leads to a corrupted sheet. That's why the method was removed.
Multiple cells can be selected, but there are no methods to set the selected cells in Apache POI's high-level classes. So the underlying low-level classes need to be used. In doing so, one needs to differentiate between XSSF and HSSF, because they use different low-level classes.
The following complete example sets the active cell to B2. This also sets the sheet view's selection and active cell to that single cell, B2. Then it uses low-level methods of XSSF and HSSF to set the selection to B2:E5.
import java.io.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.CellAddress;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFSheet;

class CreateExcelSelectMultipleCells {
    public static void main(String[] args) throws Exception {
        try (Workbook workbook = new XSSFWorkbook(); FileOutputStream out = new FileOutputStream("Excel.xlsx") ) {
        //try (Workbook workbook = new HSSFWorkbook(); FileOutputStream out = new FileOutputStream("Excel.xls") ) {
            Sheet sheet = workbook.createSheet();
            Row row;
            Cell cell;
            for (int r = 0; r < 6; r++) {
                row = sheet.createRow(r);
                for (int c = 0; c < 6; c++) {
                    cell = row.createCell(c);
                    cell.setCellValue("R" + (r+1) + "C" + (c+1));
                }
            }
            // set active cell; this also sets sheet view having selection and active cell to one given cell
            sheet.setActiveCell(new CellAddress("B2"));
            // set selected cells
            if (sheet instanceof XSSFSheet) {
                XSSFSheet xssfSheet = (XSSFSheet) sheet;
                xssfSheet.getCTWorksheet().getSheetViews().getSheetViewArray(0).getSelectionArray(0).setSqref(
                    java.util.Arrays.asList("B2:E5"));
            } else if (sheet instanceof HSSFSheet) {
                HSSFSheet hssfSheet = (HSSFSheet) sheet;
                org.apache.poi.hssf.record.SelectionRecord selectionRecord = hssfSheet.getSheet().getSelection();
                java.lang.reflect.Field field_6_refs = org.apache.poi.hssf.record.SelectionRecord.class.getDeclaredField("field_6_refs");
                field_6_refs.setAccessible(true);
                field_6_refs.set(
                    selectionRecord,
                    new org.apache.poi.hssf.util.CellRangeAddress8Bit[] { new org.apache.poi.hssf.util.CellRangeAddress8Bit(1,4,1,4) }
                );
            }
            workbook.write(out);
        }
    }
}
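The HSSF branch passes CellRangeAddress8Bit(1,4,1,4) for the range B2:E5; the arguments are zero-based indexes in the order first row, last row, first column, last column. If the range comes in as an A1-style string, a conversion along the following lines might be used. This is a sketch in plain Java with hypothetical names; it assumes a simple reference without sheet name or $ signs.

```java
import java.util.Arrays;

final class A1RangeSketch {
    // Converts column letters such as "B" or "AA" to a zero-based column index.
    static int columnIndex(String letters) {
        int index = 0;
        for (char ch : letters.toUpperCase().toCharArray()) {
            index = index * 26 + (ch - 'A' + 1);
        }
        return index - 1;
    }

    // Splits one cell reference such as "E5" into {zeroBasedRow, zeroBasedColumn}.
    static int[] cell(String ref) {
        int split = 0;
        while (split < ref.length() && Character.isLetter(ref.charAt(split))) {
            split++;
        }
        int row = Integer.parseInt(ref.substring(split)) - 1;
        return new int[] { row, columnIndex(ref.substring(0, split)) };
    }

    // Returns {firstRow, lastRow, firstCol, lastCol} for a range such as "B2:E5".
    static int[] range(String a1Range) {
        String[] parts = a1Range.split(":");
        int[] first = cell(parts[0]);
        int[] last = cell(parts[parts.length - 1]);
        return new int[] { first[0], last[0], first[1], last[1] };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(range("B2:E5"))); // prints [1, 4, 1, 4]
    }
}
```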
I've been trying to build my first web application using IntelliJ and Tomcat, and one of the tasks is to upload and process an Excel sheet file. So I looked online and found the Apache POI library, which can help me parse an Excel file. But when I downloaded all the required jars, copied and pasted some test code, and started up the server, the web page showed an error with HTTP status 500, the root cause being: java.lang.ClassNotFoundException: org.apache.poi.openxml4j.opc.internal.marshallers.PackagePropertiesMarshaller$NamespaceImpl.
I've encountered this problem with other jars, but solved it each time by putting the corresponding jar inside Tomcat's lib folder; this one is the only exception.
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;

public class ExcelParser {
    private String pathname;

    public ExcelParser(String pathname) {
        this.pathname = pathname;
    }

    public void parse() {
        try {
            FileInputStream file = new FileInputStream(new File("/Users/JohnDoe/Desktop/test.xlsx"));
            //Create Workbook instance holding reference to .xlsx file
            XSSFWorkbook workbook = new XSSFWorkbook(file);
            //Get first/desired sheet from the workbook
            XSSFSheet sheet = workbook.getSheetAt(0);
            //Iterate through the rows one by one
            Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                //For each row, iterate through all the columns
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
                    //Check the cell type and format accordingly (tab-separated output)
                    switch (cell.getCellType()) {
                        case NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                    }
                }
                System.out.println();
            }
            file.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I'm just testing the functionality of Excel parsing, so don't really worry about the pathname.
By the way, I can see that this (inner) class is declared in poi-ooxml-4.1.0.jar, which is also included in my Tomcat lib folder.
Any ideas as to why this is happening and how I should fix it are appreciated.
To use Apache POI, you need the following jar files.
poi-ooxml-4.1.0.jar
poi-ooxml-schemas-4.1.0.jar
xmlbeans-3.1.0.jar
commons-compress-1.18.jar
curvesapi-1.06.jar
poi-4.1.0.jar
commons-codec-1.12.jar
commons-collections4-4.3.jar
commons-math3-3.6.1.jar
You can also refer to the following question, where I have answered a few related things:
Unable to read Excel using Apache POI
I think I missed something when moving the jars to the lib directory: after I removed the original files and reran the cp command, everything works now. I'm closing the question with this answer, thanks for the help!
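When a ClassNotFoundException shows up even though the jar seems to be in place, it can help to ask the JVM where (or whether) it finds the class. The following is a diagnostic sketch in plain Java; the class and method names are hypothetical, and inside a web application you would probably want to pass the web app's class loader (for example the thread context class loader) instead of the one used here.

```java
import java.security.CodeSource;

final class ClassLocator {
    // Reports the jar or directory a class is loaded from, or why it could not be loaded.
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run its static initializers
            Class<?> type = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null
                ? "loaded, but without a code source (typically a JDK class)"
                : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found: " + e;
        }
    }

    public static void main(String[] args) {
        // The class name from the error message above
        System.out.println(locate(
            "org.apache.poi.openxml4j.opc.internal.marshallers.PackagePropertiesMarshaller$NamespaceImpl"));
    }
}
```

If the printed location is not the jar you expect, or the class is not found at all, the copy in the lib folder is probably incomplete or shadowed by another version.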
I need to write very large numbers (>91430000000000000000) to an Excel cell.
The issue is that the maximum value for a cell is 9143018315613270000, and every larger value is replaced by that maximum.
This is easily resolved by hand by adding an apostrophe in front of the number, for example '9143018315313276189
But how do I do the same trick via Apache POI? I have the following code:
attrId.setCellValue(new XSSFRichTextString('\'' + value.getId().toString()));
But it doesn't work:
Here the first row has no apostrophe at all; the second one was typed by hand and is the result I'm looking for. The third is the result of my code. I also tried the setCellValue overloads that take a double and a String; neither of them helps either.
So, here is the question: how do I write very large numbers to Excel via Apache POI?
Set the cell style first
DataFormat format = workbook.createDataFormat();
CellStyle testStyle = workbook.createCellStyle();
testStyle.setDataFormat(format.getFormat("#"));
String bigNumber = "9143018315313276189";
row.createCell(40).setCellStyle(testStyle);
row.getCell(40).setCellValue(bigNumber);
Can you set the cell type and see what happens? Or, if you have already set it, please post your code so that others can look at it.
cell.setCellType(Cell.CELL_TYPE_STRING);
Please refer to this question for details on how to set a string value on a cell: How can I read numeric strings in Excel cells as string (not numbers) with Apache POI?
I wrote the following sample and it worked for me (poi-3.1.3)
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

public class WriteToExcel {
    public static void main(String[] args) throws IOException {
        HSSFWorkbook workbook = new HSSFWorkbook();
        HSSFSheet sheet = workbook.createSheet("Sample sheet");
        Row row = sheet.createRow(0);
        Cell cell = row.createCell(0);
        cell.setCellType(Cell.CELL_TYPE_STRING);
        cell.setCellValue("91430183153132761893333");
        try {
            FileOutputStream out =
                new FileOutputStream(new File("C:\\test_stackoverflow\\new.xls"));
            workbook.write(out);
            out.close();
            System.out.println("Excel written successfully..");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
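A likely reason the digits get lost when the value is written as a number: numeric cells hold double-precision floating-point values, and Excel shows at most 15 significant digits of them, so a 19-digit integer cannot survive as a number. The plain-Java sketch below (hypothetical names) checks whether a decimal string would survive a round trip through double; values for which it returns false are candidates for being written as text, as in the answers above.

```java
import java.math.BigDecimal;

final class BigNumberPrecision {
    // True when the decimal string is exactly representable as a double.
    static boolean survivesAsDouble(String digits) {
        BigDecimal exact = new BigDecimal(digits);
        // new BigDecimal(double) yields the exact value stored in the double
        BigDecimal viaDouble = new BigDecimal(exact.doubleValue());
        return exact.compareTo(viaDouble) == 0;
    }

    public static void main(String[] args) {
        System.out.println(survivesAsDouble("9143018315313276189")); // false
        System.out.println(survivesAsDouble("1024"));                // true
    }
}
```

Note that this only tests the double representation; Excel's 15-digit display limit is stricter, so a value can pass this check and still be shown rounded.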
I am trying to read this test file with the Apache POI API (current version 3.10-FINAL). The following test code
import java.io.FileInputStream;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelTest {
    public static void main(String[] args) throws Exception {
        String filename = "testfile.xlsx";
        XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(filename));
        XSSFSheet sheet = wb.getSheetAt(0);
        System.out.println(sheet.getFirstRowNum());
    }
}
results in the first row number being -1 (and existing rows come back as null). The test file was created by Excel 2010 (I have no control over that part) and can be read with Excel without warnings or problems. If I open and save the file with my version of Excel (2013), it can be read perfectly, as expected.
Any hints as to why I can't read the original file, or how I can, are highly appreciated.
The testfile.xlsx was created with "SpreadsheetGear 7.1.1.120". Open the XLSX file with software that can handle ZIP archives and look into /xl/workbook.xml to see that. In the worksheets/sheet?.xml files, note that all row elements lack row numbers. If I put a row number in the first row tag, like <row r="1">, then Apache POI can read this row.
As to the question of who is to blame for this, the answer is definitely both Apache POI and SpreadsheetGear ;-). Apache POI, because the attribute r in the row element is optional. But SpreadsheetGear too, because there is no reason to omit the r attribute when Excel itself always writes it.
If you cannot get testfile.xlsx in a format that Apache POI can read directly, then you must work with the underlying objects. The following works with your testfile.xlsx:
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow;
import java.util.List;

class Testfile {
    public static void main(String[] args) {
        try {
            InputStream inp = new FileInputStream("testfile.xlsx");
            Workbook wb = WorkbookFactory.create(inp);
            Sheet sheet = wb.getSheetAt(0);
            System.out.println(sheet.getFirstRowNum());
            // Go through the low-level row list, since the high-level API does not see these rows
            CTWorksheet ctWorksheet = ((XSSFSheet)sheet).getCTWorksheet();
            CTSheetData ctSheetData = ctWorksheet.getSheetData();
            List<CTRow> ctRowList = ctSheetData.getRowList();
            Row row = null;
            Cell[] cell = new Cell[2];
            for (CTRow ctRow : ctRowList) {
                row = new MyRow(ctRow, (XSSFSheet)sheet);
                cell[0] = row.getCell(0);
                cell[1] = row.getCell(1);
                // compare string content with isEmpty(), not with != ""
                if (cell[0] != null && cell[1] != null && !cell[0].toString().isEmpty() && !cell[1].toString().isEmpty())
                    System.out.println(cell[0].toString()+"\t"+cell[1].toString());
            }
        } catch (InvalidFormatException ifex) {
            ifex.printStackTrace();
        } catch (FileNotFoundException fnfex) {
            fnfex.printStackTrace();
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
    }
}

class MyRow extends XSSFRow {
    MyRow(org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow row, XSSFSheet sheet) {
        super(row, sheet);
    }
}
I have used:
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow
These are part of the Apache POI binary distribution poi-bin-3.10.1-20140818 and are located in poi-ooxml-schemas-3.10.1-20140818.jar.
For documentation, see http://grepcode.com/snapshot/repo1.maven.org/maven2/org.apache.poi/ooxml-schemas/1.1/
And I have extended XSSFRow, because we can't call the XSSFRow constructor directly, since it has protected access.
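The manual inspection described in this answer (opening the XLSX as a ZIP archive and looking at the row elements) can also be scripted. The sketch below uses only the JDK; the class and method names are hypothetical, the entry name xl/worksheets/sheet1.xml is an assumption about the file at hand, and the regular expression assumes unprefixed <row> tags, so treat it as a quick diagnostic rather than a real XML parser. readAllBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class RowAttributeCheck {
    // Matches opening or self-closing <row ...> tags, but not e.g. <rowBreaks>.
    private static final Pattern ROW_TAG = Pattern.compile("<row(\\s[^>]*)?/?>");
    // Matches an attribute named exactly "r".
    private static final Pattern R_ATTR = Pattern.compile("\\sr\\s*=");

    // Counts the row elements that carry no r attribute.
    static int countRowsWithoutR(String sheetXml) {
        int missing = 0;
        Matcher m = ROW_TAG.matcher(sheetXml);
        while (m.find()) {
            String attrs = m.group(1);
            if (attrs == null || !R_ATTR.matcher(attrs).find()) {
                missing++;
            }
        }
        return missing;
    }

    // Reads one entry of the XLSX (a ZIP archive) as UTF-8 text.
    static String readEntry(String xlsxPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No such entry: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String xml = readEntry("testfile.xlsx", "xl/worksheets/sheet1.xml");
        System.out.println("rows without r attribute: " + countRowsWithoutR(xml));
    }
}
```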
I'm trying to get updated cell values after using the setForceFormulaRecalculation method, but I'm still getting the old values, which are not the actual results. If I open the original file by clicking it, Excel shows an "update links" dialog box. If I click the "OK" button, it updates all the formula results. So I want to update the Excel sheet's links with POI before the file is opened. Please help with this situation.
// Before: setting the values
HSSFCell cel2=row1.getCell(2);
HSSFCell cel4=row1.getCell(5);
cel2.setCellValue(690);
cel4.setCellValue(690);
wb.setForceFormulaRecalculation(true);
wb.write(stream);
// Afterwards, I try to read the evaluated formula result as follows
HSSFWorkbook wb = HSSFReadWrite.readFile("D://workspace//ExcelProject//other.xls");
HSSFSheet sheet=wb.getSheetAt(14);
HSSFRow row11=sheet.getRow(10);
System.out.println("** cell val: "+row11.getCell(3).getNumericCellValue());
I also tried the FormulaEvaluator, but it shows the following error:
Could not resolve external workbook name '\Users\asus\Downloads\??? & ???? ?????_091230.xls'. Workbook environment has not been set up.
at org.apache.poi.ss.formula.OperationEvaluationContext.createExternSheetRefEvaluator(OperationEvaluationContext.java:87)
at org.apache.poi.ss.formula.OperationEvaluationContext.getArea3DEval(OperationEvaluationContext.java:273)
at org.apache.poi.ss.formula.WorkbookEvaluator.getEvalForPtg(WorkbookEvaluator.java:660)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateFormula(WorkbookEvaluator.java:527)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:288)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:230)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCellValue(HSSFFormulaEvaluator.java:351)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCell(HSSFFormulaEvaluator.java:213)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateAllFormulaCells(HSSFFormulaEvaluator.java:324)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateAll(HSSFFormulaEvaluator.java:343)
at HSSFReadWrite.readSheetData(HSSFReadWrite.java:85)
at HSSFReadWrite.main(HSSFReadWrite.java:346)
Caused by: org.apache.poi.ss.formula.CollaboratingWorkbooksEnvironment$WorkbookNotFoundException: Could not resolve external workbook name '\Users\asus\Downloads\??? & ???? ?????_091230.xls'. Workbook environment has not been set up.
at org.apache.poi.ss.formula.CollaboratingWorkbooksEnvironment.getWorkbookEvaluator(CollaboratingWorkbooksEnvironment.java:161)
at org.apache.poi.ss.formula.WorkbookEvaluator.getOtherWorkbookEvaluator(WorkbookEvaluator.java:181)
at org.apache.poi.ss.formula.OperationEvaluationContext.createExternSheetRefEvaluator(OperationEvaluationContext.java:85)
... 11 more
OK, trying an answer:
First of all: support for links to external workbooks is not included in the current stable version 3.10. So with this version it is not possible to evaluate such links directly. That's why evaluateAll() fails for workbooks with links to external workbooks.
With version 3.11 it will be possible to do so, but only if all the workbooks are opened and evaluators for all of them are present. See: http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/FormulaEvaluator.html#setupReferencedWorkbooks%28java.util.Map%29
What we can do with the stable version 3.10 is evaluate all the cells containing formulas that have no links to external workbooks.
Example:
The workbook "workbook.xlsx" contains a formula with a link to an external workbook in A2:
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import java.io.FileOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.HashMap;

class ExternalReferenceTest {
    public static void main(String[] args) {
        try {
            InputStream inp = new FileInputStream("workbook.xlsx");
            Workbook wb = WorkbookFactory.create(inp);
            Sheet sheet = wb.getSheetAt(0);
            Row row = sheet.getRow(0);
            if (row == null) row = sheet.createRow(0);
            Cell cell = row.getCell(0);
            if (cell == null) cell = row.createCell(0);
            cell.setCellValue(123.45);
            cell = row.getCell(1);
            if (cell == null) cell = row.createCell(1);
            cell.setCellValue(678.90);
            cell = row.getCell(2);
            if (cell == null) cell = row.createCell(2);
            cell.setCellFormula("A1+B1");
            FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
            //evaluator.evaluateAll(); //will not work because the external workbook for the formula in A2 is not accessible
            System.out.println(sheet.getRow(1).getCell(0)); //[1]Sheet1!$A$1
            //but we surely can evaluate single cells:
            cell = wb.getSheetAt(0).getRow(0).getCell(2);
            System.out.println(evaluator.evaluate(cell).getNumberValue()); //802.35
            FileOutputStream fileOut = new FileOutputStream("workbook.xlsx");
            wb.write(fileOut);
            fileOut.flush();
            fileOut.close();
        } catch (InvalidFormatException ifex) {
            ifex.printStackTrace();
        } catch (FileNotFoundException fnfex) {
            fnfex.printStackTrace();
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
    }
}
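To evaluate only the cells without external links, each formula cell has to be classified first. One possible heuristic, sketched here in plain Java with hypothetical names, is to look at the formula text for a bracketed workbook reference followed by a sheet name and "!", as in the "[1]Sheet1!$A$1" printed above. This is an assumption about how the formula text looks, not a POI API, and it may misjudge unusual sheet names.

```java
import java.util.regex.Pattern;

final class ExternalReferenceHeuristic {
    // "[...]" followed by a sheet name and "!", e.g. "[1]Sheet1!$A$1".
    // Operators and parentheses are excluded from the sheet-name part so that
    // table references such as Table1[Col] followed by another sheet reference do not match.
    private static final Pattern EXTERNAL_REF =
        Pattern.compile("\\[[^\\[\\]]+\\][^!\\[\\]()+\\-*/,=<>&^]*!");

    // Returns true when the formula text appears to link to another workbook.
    static boolean looksExternal(String formula) {
        return formula != null && EXTERNAL_REF.matcher(formula).find();
    }

    public static void main(String[] args) {
        System.out.println(looksExternal("[1]Sheet1!$A$1")); // true
        System.out.println(looksExternal("A1+B1"));          // false
    }
}
```

Cells for which looksExternal returns false could then be passed one by one to the evaluator, as done for C1 in the example above.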