To do some statistical analysis I need to extract the values in a column of an Excel sheet. I have been using the Apache POI package to read Excel files, and it works fine when one needs to iterate over rows. However, I couldn't find anything about getting columns, either in the API or through Google searching.
I need the max and min values of different columns so that I can generate random numbers from them. Without a way to pick out individual columns, the only other option is to iterate over rows and columns, fetching and comparing the values one by one, which doesn't sound very time-efficient.
Any ideas on how to tackle this problem?
Thanks,
Excel files are row based rather than column based, so the only way to get all the values in a column is to look at each row in turn. There's no quicker way to get at the columns, because cells in a column aren't stored together.
Your code probably wants to be something like:
List<Double> values = new ArrayList<Double>();
for(Row r : sheet) {
   Cell c = r.getCell(columnNumber);
   if(c != null) {
      if(c.getCellType() == Cell.CELL_TYPE_NUMERIC) {
         values.add(c.getNumericCellValue());
      } else if(c.getCellType() == Cell.CELL_TYPE_FORMULA && c.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC) {
         // formula cell whose cached result is numeric
         values.add(c.getNumericCellValue());
      }
   }
}
That'll then give you all the numeric cell values in that column.
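Once the column values are in a list, the min/max and random-number step from the question no longer needs POI at all. The following is only a minimal sketch of that step: the class name ColumnStats and its methods are hypothetical, and a uniform distribution between min and max is an assumption, since the question does not say which distribution is wanted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Apache POI. */
final class ColumnStats {
    private ColumnStats() {}

    /** Smallest value in the list; throws NoSuchElementException if the list is empty. */
    static double min(List<Double> values) {
        return Collections.min(values);
    }

    /** Largest value in the list; throws NoSuchElementException if the list is empty. */
    static double max(List<Double> values) {
        return Collections.max(values);
    }

    /** Uniformly distributed random number between the column's min and max (assumed distribution). */
    static double randomInRange(List<Double> values, Random rng) {
        double lo = min(values);
        double hi = max(values);
        return lo + (hi - lo) * rng.nextDouble();
    }

    public static void main(String[] args) {
        // Stand-in for the list filled by the row loop above.
        List<Double> values = Arrays.asList(4.0, 1.5, 9.25, 3.0);
        System.out.println("min = " + min(values));
        System.out.println("max = " + max(values));
        System.out.println("random = " + randomInRange(values, new Random()));
    }
}
```

A single pass over the rows per column is enough: collect the values once, then compute whatever statistics are needed from the list.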
Just wanted to add: if your file has headers and you are not sure of the column index but want to pick the columns under specific headers (column names), you can try something like this:
for(Row r : datatypeSheet)
{
Iterator<Cell> headerIterator = r.cellIterator();
Cell header = null;
// table header row
if(r.getRowNum() == 0)
{
// getting specific column's index
while(headerIterator.hasNext())
{
header = headerIterator.next();
if(header.getStringCellValue().equalsIgnoreCase("column1Index"))
{
column1Index = header.getColumnIndex();
}
}
}
else
{
Cell column1Cells = r.getCell(column1Index);
if(column1Cells != null)
{
if(column1Cells.getCellType() == Cell.CELL_TYPE_NUMERIC)
{
// adding to a list
column1Data.add(column1Cells.getNumericCellValue());
}
else if(column1Cells.getCellType() == Cell.CELL_TYPE_FORMULA && column1Cells.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC)
{
// adding to a list
column1Data.add(column1Cells.getNumericCellValue());
}
}
}
}
I'm getting a weird error while trying to read cell values through Apache POI in Java:
System.out.println(row.getCell(13, Row.CREATE_NULL_AS_BLANK).getStringCellValue());
always prints null, even after specifying the missing-cell policy as Row.CREATE_NULL_AS_BLANK. My logic for writing to the cell is:
public void writeCell( String value, Sheet sheet, int rowNum, int colNum)
{
Row row = sheet.getRow(rowNum);
if (row == null)
{
row = sheet.createRow(rowNum);
}
Cell cell = row.createCell(colNum, Cell.CELL_TYPE_STRING);
if (value == null)
{
return;
}
cell.setCellValue(value);
}
When I write to the cell at colNum = 13, the String value object is null. I can't work out what is going wrong.
This line doesn't do what you seem to think it does:
System.out.println(row.getCell(13, Row.CREATE_NULL_AS_BLANK).getStringCellValue());
In effect, that's doing
Cell cell = row.getCell(13);
if (cell == null) { cell = row.createCell(13, Cell.CELL_TYPE_BLANK); }
So, if there is nothing in that cell, it creates a new, empty blank one.
Then you try doing:
cell.getStringCellValue()
This only works for String cells, and in the missing case you've told POI to give you a Blank new cell!
If you really just want a string value of a cell, use DataFormatter.formatCellValue(Cell) - that returns a String representation of your cell including formatting. Otherwise, check the type of your cell before trying to fetch the value!
getStringCellValue() on the Cell interface would return "" if your code worked as intended (leaving the cell blank).
Is it possible that the value for column 13 is not null but the string "null"?
I have an Excel file and I am using Apache POI to read its data. When I read a cell value, how can I tell whether the cell is part of a merged region, and how do I get the value of the merged cell?
I want to know whether a cell is merged or not. If it is, I will read the value from the first row and first column of the merged region; if it is not, I will read the cell's value directly,
like
String var = String.valueOf(sheet.getRow(Row).getCell(Cell));
Two key methods you need:
Sheet.getMergedRegions()
CellRangeAddressBase.isInRange(row,column) (merged regions extend from this)
Your code would just be something like:
public CellRangeAddress getMergedRegionForCell(Cell c) {
Sheet s = c.getRow().getSheet();
for (CellRangeAddress mergedRegion : s.getMergedRegions()) {
if (mergedRegion.isInRange(c.getRowIndex(), c.getColumnIndex())) {
// This region contains the cell in question
return mergedRegion;
}
}
// Not in any
return null;
}
Then check whether you get null back. If not, read the first row and first column of the region to find the top-left cell of the merged region containing your cell of interest.
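The "find the region, then go to its top-left cell" logic can be sketched without POI on the classpath. In the sketch below, Region is a hypothetical stand-in for POI's CellRangeAddress (0-based, inclusive bounds, same argument order of firstRow, lastRow, firstCol, lastCol), and valueCell is a made-up helper name. It is an illustration of the lookup only, not POI code.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the merged-cell lookup; none of these types are POI classes. */
final class MergedLookup {
    private MergedLookup() {}

    /** Hypothetical stand-in for CellRangeAddress: 0-based, inclusive bounds. */
    static final class Region {
        final int firstRow, lastRow, firstCol, lastCol;

        Region(int firstRow, int lastRow, int firstCol, int lastCol) {
            this.firstRow = firstRow;
            this.lastRow = lastRow;
            this.firstCol = firstCol;
            this.lastCol = lastCol;
        }

        boolean isInRange(int row, int col) {
            return row >= firstRow && row <= lastRow
                && col >= firstCol && col <= lastCol;
        }
    }

    /**
     * Returns {row, col} of the cell to read the value from: the top-left
     * cell of the containing merged region, or the cell itself if unmerged.
     */
    static int[] valueCell(List<Region> mergedRegions, int row, int col) {
        for (Region region : mergedRegions) {
            if (region.isInRange(row, col)) {
                return new int[] {region.firstRow, region.firstCol};
            }
        }
        return new int[] {row, col};
    }

    public static void main(String[] args) {
        // One merged block covering rows 0-1 and columns 0-2.
        List<Region> merged = Arrays.asList(new Region(0, 1, 0, 2));
        System.out.println(Arrays.toString(valueCell(merged, 1, 2)));
        System.out.println(Arrays.toString(valueCell(merged, 2, 2)));
    }
}
```

With real POI objects the returned coordinates would be passed to sheet.getRow(...).getCell(...), with the usual null checks on the row and the cell.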
You can use Sheet.getMergedRegions() to determine all ranges of merged cells. Then you can use CellRangeAddress.isInRange(row,column) on the returned ranges to check if the cell in question is a merged cell.
public boolean isMergedCell(int row, int column) {
for (CellRangeAddress range : sheet.getMergedRegions()) {
if (range.isInRange(row, column)) {
return true;
}
}
return false;
}
This is my code (Kotlin) for getting Excel cell data as a string, taking into account that the cell may be merged. I use it for loading Excel data into a database.
fun getCellStr(sheet: XSSFSheet, cell: Cell): String {
var res = ""
val formatter = DataFormatter()
var inRange = false
for (range in sheet.getMergedRegions()) {
if (range.isInRange(cell.rowIndex, cell.columnIndex)) {
for (rIndex in range.firstRow..range.lastRow) {
for (cIndex in range.firstColumn..range.lastColumn) {
res = "$res${formatter.formatCellValue(sheet.getRow(rIndex).getCell(cIndex))}"
}
}
inRange = true
}
}
if (!inRange) {
res = formatter.formatCellValue(cell)
}
return res.trim()
}
Is there a way to get the cell object or coordinate by the data the cell contains?
For example if the cell with coordinates (1;5) contains the string "FINDME", i'd like to do something like Workbook.GetCellByData("FINDME") and it should return the Cell object or (1;5).
I have found a code snippet on the Apache POI website that could be useful. I could just read the whole workbook and find the data with an IF-statement, but that's kind of dirty...
EDIT:
I have coded the "dirty" solution as follows:
public Cell getCellByContent(String data) {
for (Row row : wb.getSheetAt(0)) {
for (Cell cell : row) {
if (cell.getCellType() == Cell.CELL_TYPE_STRING){
System.out.println(String.format("Found String type at (%s,%s) and read: %s", row.getRowNum(), cell.getColumnIndex(), cell.getStringCellValue()));
if (cell.getStringCellValue() == data) { //HERE
return cell; //HERE
} //HERE
}
}
}
System.out.println("Can't find it bruh!");
return null;
}
For some reason it fails at the if-statement. I'd like to get the Cell with the content "%title%".
Output:
Found String type at (0,0) and read: %title% <------ IT'S RIGHT HERE!
Found String type at (2,0) and read: Test Information
...
Can't find it bruh!
Does someone have an idea why this is not working?
To fix the dirty solution replace
if (cell.getStringCellValue() == data)
with
if (cell.getStringCellValue().equals(data))
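The reason is that == on two String references compares object identity, while equals() compares the characters. A small self-contained illustration follows; the class and method names are made up for the example, and new String(...) is used only to force a distinct object, much as a string read from a workbook would be.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: reference comparison versus content comparison. */
final class StringMatchDemo {
    private StringMatchDemo() {}

    /** Index of the first element whose content equals target, or -1 if none. */
    static int indexOfContent(List<String> cellTexts, String target) {
        for (int i = 0; i < cellTexts.size(); i++) {
            // target.equals(...) is also safe when the list element is null.
            if (target.equals(cellTexts.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String fromFile = new String("%title%"); // distinct object, same characters
        String wanted = "%title%";

        System.out.println(fromFile == wanted);      // false: different objects
        System.out.println(fromFile.equals(wanted)); // true: same content
        System.out.println(
            indexOfContent(Arrays.asList(fromFile, "Test Information"), wanted)); // 0
    }
}
```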
I think I can help you. Just use two nested for() loops over the rows and columns, then read each cell with sheet.getRow(i).getCell(j) (i is the row number and j is the column number).
I have an Excel file with 3000 rows. I removed 2000 of them (in the MS Excel app), but when I call sheet.getLastRowNum() from code, it gives me 3000 (instead of 1000). How can I remove the blank rows?
I tried the code from here but it doesn't work.
There are two ways to do it:
1.) Without code:
Copy the content of your Excel file and paste it into a new Excel file, then rename it as required.
2.) With code (I did not find any function for it, so I created my own):
You need to check each cell for blank, empty-string, or null values.
Before processing a row (I am assuming you process row by row; I am using org.apache.poi.xssf.usermodel.XSSFRow), call the method below in an if check. If it returns true, the row (XSSFRow) contains some value; otherwise move the iterator on to the next row.
public boolean containsValue(XSSFRow row, int fcell, int lcell)
{
boolean flag = false;
for (int i = fcell; i < lcell; i++) {
if (StringUtils.isEmpty(String.valueOf(row.getCell(i))) == true ||
StringUtils.isWhitespace(String.valueOf(row.getCell(i))) == true ||
StringUtils.isBlank(String.valueOf(row.getCell(i))) == true ||
String.valueOf(row.getCell(i)).length() == 0 ||
row.getCell(i) == null) {}
else {
flag = true;
}
}
return flag;
}
So finally your processing method will look like
.
.
.
int fcell = row.getFirstCellNum();// first cell number of excel
int lcell = row.getLastCellNum(); //last cell number of excel
while (rows.hasNext()) {
row = (XSSFRow) rows.next();//increment the row iterator
if(containsValue(row, fcell, lcell) == true){
.
.
..//processing
.
.
}
}
Hope this will help. :)
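The same scan can be turned around to find the last row that actually holds data, which is the number the question was after. The sketch below shows the idea on a plain list-of-lists model rather than POI rows, so the names BlankRowScan, containsValue and lastNonBlankRow are hypothetical and the cell texts are assumed to be already converted to strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch on a plain data model; rows stand in for sheet rows, strings for cell text. */
final class BlankRowScan {
    private BlankRowScan() {}

    /** True if at least one cell in the row holds non-whitespace text. */
    static boolean containsValue(List<String> row) {
        if (row == null) {
            return false; // missing row
        }
        for (String cell : row) {
            if (cell != null && !cell.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    /** 0-based index of the last row with any value, or -1 if every row is blank. */
    static int lastNonBlankRow(List<List<String>> rows) {
        for (int i = rows.size() - 1; i >= 0; i--) {
            if (containsValue(rows.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "b"));
        rows.add(Arrays.asList("", " "));
        rows.add(Arrays.asList("c"));
        rows.add(Arrays.asList("  ")); // trailing blank row, as left behind by clearing cells
        System.out.println("last row with data: " + lastNonBlankRow(rows));
    }
}
```

Scanning from the bottom stops at the first row with data, so trailing blank rows cost little and blank rows in the middle of the sheet are not miscounted.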
I haven't found a way to easily get the "real" number of rows, but I've found a solution for removing such rows, which might be useful to someone tackling a similar issue. See below.
I've searched a bit and found this solution
All it does is delete those empty rows from the bottom, which might be exactly what you want.
As I understand it, to delete the rows you must have selected all the cells and pressed the Delete key. If so, you deleted the rows the wrong way: the cells become blank rather than being deleted, so the rows still contain cells with blank values, and that is why they are included in the row count.
The correct way is to select the row by clicking its row number, to the left of its first cell, which selects the complete row. Select all the required rows with the help of the Shift key, then right-click and select Delete.
This may be helpful for you.
remove rows/columns by poi api
transfer xls to csv
transfer csv to xls
hope this will help you
I'm opening an Excel (xls) file in my Java application with POI.
There are 30 lines in this Excel file.
I need to get the value at ColumnIndex 9.
My code:
Workbook wb;
wb = WorkbookFactory.create(inp);
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
if (row.getLastCellNum() >= 6) {
for (Cell cell : row) {
if(cell.getColumnIndex() == 9) {
//do something
}
}
}
}
Every Row in Excel has Values in Columns 1-14.
My problem is, only some Values are recognized. I wrote the same value in every cell in ColumnIndex 9 (10th Column in my Excel sheet), but the Problem is still the same.
What could cause this problem?
Make sure you set the same date format for all cells in the column (select the column and set the format explicitly). I also believe using the DateUtil class to get the data is more appropriate than calling cell.getDateCellValue().
POI uses 0-based counting for columns. So, if you want the 9th column, you need to fetch the cell with index 8, not 9. It looks like you're checking for the column with index 9, so you are one column out.
If you're not sure about 0 based indexing, then the safest thing is to use the CellReference class to help you. This will translate between Excel style references, eg A1, and POI style 0-based offsets eg 0,0. Use something like:
CellReference ref = new CellReference("I10");
Row r = sheet.getRow(ref.getRow());
if (r == null) {
// That row is empty
} else {
Cell c = r.getCell(ref.getCol());
// c is now the cell at I10
}
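To make the 0-based mapping concrete, here is a small stand-alone sketch of the conversion between Excel column letters and 0-based indexes. It is not POI's implementation; ColumnLetters and its methods are hypothetical names used only to show the arithmetic (A is 0, I is 8, Z is 25, AA is 26).

```java
/** Illustration of the letter/index arithmetic; not a POI class. */
final class ColumnLetters {
    private ColumnLetters() {}

    /** Converts Excel column letters (e.g. "I" or "AA") to a 0-based index. */
    static int toIndex(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("column letters must not be empty");
        }
        int oneBased = 0;
        for (int i = 0; i < letters.length(); i++) {
            char ch = Character.toUpperCase(letters.charAt(i));
            if (ch < 'A' || ch > 'Z') {
                throw new IllegalArgumentException("not a column letter: " + ch);
            }
            // Bijective base 26: A=1 ... Z=26, AA=27, ...
            oneBased = oneBased * 26 + (ch - 'A' + 1);
        }
        return oneBased - 1;
    }

    /** Converts a 0-based column index back to Excel column letters. */
    static String toLetters(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be >= 0");
        }
        StringBuilder sb = new StringBuilder();
        int n = index + 1; // work in 1-based terms
        while (n > 0) {
            int rem = (n - 1) % 26;
            sb.insert(0, (char) ('A' + rem));
            n = (n - 1) / 26;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("I  -> " + toIndex("I"));   // 9th column, index 8
        System.out.println("9  -> " + toLetters(9));   // index 9 is the 10th column, J
    }
}
```

So a check against index 9 looks at column J, the 10th column, while the 9th column (I) has index 8.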
Seems to be a Problem with the excel document(s).
Converting them to csv and then back to xls solves the problem.