I am trying to get specific data from an Excel sheet. The data is dynamic; it can be anything. The column headers are the only things I can use as placeholders, but their positions can vary within the sheet.
For example, I have a sheet like this:
|Name| Surname| Value|
|bar | poo | 5|
|baz | foo | 7|
I need to traverse the sheet to find the Surname column, and if I find surname = 'poo' I must pull its corresponding value, which in this sheet is in the next column. This is dynamic, though: the Surname and Value columns aren't always next to each other and can be in any position in the header row. But if I find a specific 'thing' in the Surname column, I need to pull its value.
I have managed to traverse the sheet, store all the data in a 2D array, and display that data. From the research I've done, this isn't an efficient approach, as traversing and storing large amounts of sheet data can use a lot of memory. I've read that instead of storing the values in an array, you can read through an Excel sheet and write them immediately to another sheet if they match a certain condition. E.g. (pseudocode): if (columnHeader == surname && surname == foo), then get the corresponding value and write it to a new sheet.
So my questions are:
1. How do I iterate through the sheet without storing it in an array, writing matches straight to another sheet?
2. From the code I have below, how do I search through the data in the array and, where surname = foo, get its corresponding value?
As I said, the data in the sheet is dynamic except for the column headers, but their positions are dynamic.
Sorry for the long post; any help will be greatly appreciated.
package demo.poi;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.math.BigDecimal;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class test {
public static void main(String[] args) throws Exception {
File excel = new File("test.xlsx");
FileInputStream fis = new FileInputStream(excel);
XSSFWorkbook wb = new XSSFWorkbook(fis);
XSSFSheet ws = wb.getSheetAt(0);
ws.setForceFormulaRecalculation(true);
int rowNum = ws.getLastRowNum() + 1;
int colNum = ws.getRow(0).getLastCellNum();
int surnameHeaderIndex = -1, valueHeaderIndex = -1;
//Read the headers first. Locate the ones you need
XSSFRow rowHeader = ws.getRow(0);
for (int j = 0; j < colNum; j++) {
XSSFCell cell = rowHeader.getCell(j);
String cellValue = cellToString(cell);
if("SURNAME".equalsIgnoreCase(cellValue)) {
surnameHeaderIndex = j;
} else if("VALUE".equalsIgnoreCase(cellValue)) {
valueHeaderIndex = j;
}
}
if(surnameHeaderIndex == -1 || valueHeaderIndex == -1) {
throw new Exception("Could not find header indexes\nSurname : " + surnameHeaderIndex + " | Value : " + valueHeaderIndex);
}
//createnew workbook
XSSFWorkbook workbook = new XSSFWorkbook();
//Create a blank sheet
XSSFSheet sheet = workbook.createSheet("data");
for (int i = 1; i < rowNum; i++) {
XSSFRow row = ws.getRow(i);
row = sheet.createRow(rowNum++);
String surname = cellToString(row.getCell(surnameHeaderIndex));
String value = cellToString(row.getCell(valueHeaderIndex));
int cellIndex = 0;
row.createCell(cellIndex++).setCellValue(surname);
row.createCell(cellIndex++).setCellValue(value);
}
FileOutputStream fos = new FileOutputStream(new File("test1.xlsx"));
workbook.write(fos);
fos.close();
}
public static String cellToString(XSSFCell cell) {
int type;
Object result = null;
type = cell.getCellType();
switch (type) {
case XSSFCell.CELL_TYPE_NUMERIC:
result = BigDecimal.valueOf(cell.getNumericCellValue())
.toPlainString();
break;
case XSSFCell.CELL_TYPE_STRING:
result = cell.getStringCellValue();
break;
case XSSFCell.CELL_TYPE_BLANK:
result = "";
break;
case XSSFCell.CELL_TYPE_FORMULA:
result = cell.getCellFormula();
}
return result.toString();
}
}
Something like this should be a good starting point.
Basically, you parse the first row, where the headers are located.
You find the positions of the headers you want and keep them.
In this example only two headers (surname, value) are needed, so I just keep two variables. If there are more, the solution would be to keep the positions of those headers in a HashMap, where the key is the name of the header. After that, the iteration over the rows begins. The program reads the values of the columns that are needed (row.getCell(index)). Now you have the values that you need, and only them. You can do whatever you want with them: print them, write them to a file, and so on.
Here is an example. The error handling is up to you. This is only an example.
package POIParser;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.math.BigDecimal;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class MainPoi {
public static void main(String[] args) throws Exception {
File excel = new File("test.xlsx");
FileInputStream fis = new FileInputStream(excel);
XSSFWorkbook wb = new XSSFWorkbook(fis);
XSSFSheet ws = wb.getSheetAt(0);
ws.setForceFormulaRecalculation(true);
int rowNum = ws.getLastRowNum() + 1;
int colNum = ws.getRow(0).getLastCellNum();
int surnameHeaderIndex = -1, valueHeaderIndex = -1;
// Read the headers first. Locate the ones you need
XSSFRow rowHeader = ws.getRow(0);
for (int j = 0; j < colNum; j++) {
XSSFCell cell = rowHeader.getCell(j);
String cellValue = cellToString(cell);
if ("SURNAME".equalsIgnoreCase(cellValue)) {
surnameHeaderIndex = j;
} else if ("VALUE".equalsIgnoreCase(cellValue)) {
valueHeaderIndex = j;
}
}
if (surnameHeaderIndex == -1 || valueHeaderIndex == -1) {
throw new Exception("Could not find header indexes\nSurname : "
+ surnameHeaderIndex + " | Value : " + valueHeaderIndex);
}
// createnew workbook
XSSFWorkbook workbook = new XSSFWorkbook();
// Create a blank sheet
XSSFSheet sheet = workbook.createSheet("data");
for (int i = 1; i < rowNum; i++) {
XSSFRow row = ws.getRow(i);
String surname = cellToString(row.getCell(surnameHeaderIndex));
String value = cellToString(row.getCell(valueHeaderIndex));
int cellIndex = 0;
//Create a newRow object for the output excel.
//We begin for i = 1, because of the headers from the input excel, so we go minus 1 in the new (no headers).
//If for the output we need headers, add them outside this for loop, and go with i, not i-1
XSSFRow newRow = sheet.createRow(i-1);
newRow.createCell(cellIndex++).setCellValue(surname);
newRow.createCell(cellIndex++).setCellValue(value);
}
FileOutputStream fos = new FileOutputStream(new File("test1.xlsx"));
workbook.write(fos);
fos.close();
}
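public static String cellToString(XSSFCell cell) {
// getCell() returns null for a cell that was never written; treat it as empty.
if (cell == null) {
return "";
}
// Default to an empty string so unhandled types (boolean, error) cannot cause a NullPointerException.
Object result = "";
int type = cell.getCellType();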
switch (type) {
case XSSFCell.CELL_TYPE_NUMERIC:
result = BigDecimal.valueOf(cell.getNumericCellValue())
.toPlainString();
break;
case XSSFCell.CELL_TYPE_STRING:
result = cell.getStringCellValue();
break;
case XSSFCell.CELL_TYPE_BLANK:
result = "";
break;
case XSSFCell.CELL_TYPE_FORMULA:
result = cell.getCellFormula();
}
return result.toString();
}
}
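To make the filtering step concrete, here is a minimal sketch of the idea described above: read the header row once, remember where each header sits, then test each row as it arrives and keep only the matching value. It deliberately models rows as plain String[] arrays rather than POI rows so that it runs without the library; HeaderFilterSketch, indexHeaders and valuesWhere are hypothetical names, not POI API. With POI, the array lookups would become row.getCell(index) calls, and each match could be written out with createRow/createCell instead of being collected.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; rows are modeled as String[].
final class HeaderFilterSketch {

    // Header names are compared trimmed and case-insensitively.
    private static String normalize(String header) {
        return header.trim().toLowerCase(Locale.ROOT);
    }

    // Maps each header name to the column index where it was found.
    static Map<String, Integer> indexHeaders(String[] headerRow) {
        Map<String, Integer> index = new HashMap<>();
        for (int col = 0; col < headerRow.length; col++) {
            if (headerRow[col] != null) {
                index.put(normalize(headerRow[col]), col);
            }
        }
        return index;
    }

    // Consumes the rows one at a time (first row = headers) and returns the
    // valueHeader cell of every row whose keyHeader cell equals wanted.
    static List<String> valuesWhere(Iterator<String[]> rows, String keyHeader,
                                    String wanted, String valueHeader) {
        if (!rows.hasNext()) {
            throw new IllegalArgumentException("sheet has no header row");
        }
        Map<String, Integer> index = indexHeaders(rows.next());
        Integer keyCol = index.get(normalize(keyHeader));
        Integer valueCol = index.get(normalize(valueHeader));
        if (keyCol == null || valueCol == null) {
            throw new IllegalArgumentException(
                    "missing header: " + keyHeader + " or " + valueHeader);
        }
        List<String> matches = new ArrayList<>();
        while (rows.hasNext()) {
            String[] row = rows.next();
            // Skip missing rows and rows too short to hold both columns.
            if (row == null || row.length <= Math.max(keyCol, valueCol)) {
                continue;
            }
            if (wanted.equals(row[keyCol])) {
                matches.add(row[valueCol]);
            }
        }
        return matches;
    }
}
```

For the sample sheet in the question, valuesWhere(rows, "Surname", "poo", "Value") returns a list containing only "5", whatever order the columns are in. Only the matches are kept in memory, not the whole sheet.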
I have 100 Excel files and I want to merge all of them into one Excel file. In my example I have 2 Excel files that I want to merge into one, but I can't get it to work. I am using the Apache POI API.
A workbook can contain more than one sheet, so I also want to iterate through the sheets of each workbook.
I researched this and found the following link, but it's not working for me:
https://dev.to/eiceblue/merge-excel-files-in-java-2lo2#:~:text=A%20quick%20way%20to%20merge,data%20table%20into%20another%20worksheet.
Please help me out here.
package com.cas.ExcelTest;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class Combine {
public static void main(String args[]) {
String[] files = new String[] {"Test2.xlsx","Test3.xlsx"};
XSSFWorkbook workbook = new XSSFWorkbook();
try {
for (int f = 0; f < files.length; f++) {
String file = files[f];
FileInputStream inputStream = new FileInputStream(file);
XSSFWorkbook tempWorkbook = new XSSFWorkbook(inputStream);
int numOfSheets = tempWorkbook.getNumberOfSheets();
for (int i = 0; i < numOfSheets; i++) {
XSSFSheet tempSheet = tempWorkbook.getSheetAt(i);
String newSheetName = ""+f+""+tempSheet.getSheetName();
XSSFSheet sheet = workbook.createSheet(newSheetName);
Iterator<Row> itRow = tempSheet.rowIterator();
while(itRow.hasNext()) {
Row tempRow = itRow.next();
XSSFRow row = sheet.createRow(tempRow.getRowNum());
Iterator<Cell> itCell = tempRow.cellIterator();
while(itCell.hasNext()) {
Cell tempCell = itCell.next();
XSSFCell cell = row.createCell(tempCell.getColumnIndex());
switch (tempCell.getCellType()) {
case NUMERIC:
cell.setCellValue(tempCell.getNumericCellValue());
break;
case STRING:
cell.setCellValue(tempCell.getStringCellValue());
break;
case BLANK:
break;
case BOOLEAN:
break;
case ERROR:
break;
case FORMULA:
cell.setCellValue(tempCell.getNumericCellValue());
break;
case _NONE:
break;
default:
break;
}
}
}
}
}
} catch (IOException ex1) {
System.out.println("Error reading file");
ex1.printStackTrace();
}
try (FileOutputStream outputStream = new FileOutputStream("result.xlsx")) {
workbook.write(outputStream);
}
catch(Exception ex) {
System.out.println("Something went wrong");
}
}
}
My Excel files:
Test2.xlsx
Test3.xlsx
Test3.xlsx has some extra columns, and in both files, as you can see, the heading row is all strings while the rows after it hold numeric values.
Here is an approximation of the code you need. Format it, extract functionality into methods, and check the naming of the sheets.
String[] files = new String[] {"Test2.xlsx","Test3.xlsx"};
XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = createSheetWithHeader(workbook);
try {
for (int f = 0; f < files.length; f++) {
String file = files[f];
FileInputStream inputStream = new FileInputStream(file);
XSSFWorkbook tempWorkbook = new XSSFWorkbook(inputStream);
int numOfSheets = tempWorkbook.getNumberOfSheets();
for (int i = 0; i < numOfSheets; i++) {
XSSFSheet tempSheet = tempWorkbook.getSheetAt(i);
int indexLastDataInserted = sheet.getLastRowNum();
int firstDataRow = getFirstDataRow(tempSheet);
Iterator<Row> itRow = tempSheet.rowIterator();
while(itRow.hasNext()) {
Row tempRow = itRow.next();
if (tempRow.getRowNum() >= firstDataRow) {
// Advance the target index for every copied row; otherwise each row overwrites the previous one.
XSSFRow row = sheet.createRow(++indexLastDataInserted);
Iterator<Cell> itCell = tempRow.cellIterator();
while(itCell.hasNext()) {
Cell tempCell = itCell.next();
XSSFCell cell = row.createCell(tempCell.getColumnIndex());
//At this point you will have to set the value of the cell depending on the type of data it is
switch (tempCell.getCellType()) {
case NUMERIC:
cell.setCellValue(tempCell.getNumericCellValue());
break;
case STRING:
cell.setCellValue(tempCell.getStringCellValue());
break;
/**
* Add your other types, here is your problem!!!!!
*/
}
}
}
}
}
}
}catch (IOException ex1) {
System.out.println("Error reading file");
ex1.printStackTrace();
}
try (FileOutputStream outputStream = new FileOutputStream("result.xlsx")) {
workbook.write(outputStream);
}
Function to get the first data row (necessary to avoid having to enter by hand where the header of each excel ends):
/**
* If the tab has a filter, it returns the row index of the filter + 1, otherwise it returns 0
* @param tempSheet the sheet to inspect
* @return index of first data row
*/
public static Integer getFirstDataRow(XSSFSheet tempSheet) {
Integer result = 0;
Boolean isAutoFilter = tempSheet.getCTWorksheet().isSetAutoFilter();
if (isAutoFilter) {
String autoFilterRef = tempSheet.getCTWorksheet().getAutoFilter().getRef();
result = new CellReference(autoFilterRef.substring(0, autoFilterRef.indexOf(":"))).getRow() + 1;
}
return result;
}
Create the sheet with header in the method:
public static XSSFSheet createSheetWithHeader(XSSFWorkbook workbook){
XSSFSheet sheet = workbook.createSheet("NEW_SHEET_NAME");
//Implement the header
[...]
return sheet;
}
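The key detail in the merge loop is that the target row index has to advance once per copied row, independently of the row numbers in the source files. Below is a minimal sketch of just that bookkeeping, assuming every source keeps its header in the rows before firstDataRow. Source sheets are modeled as lists of String[] and the target as a map from row index to row (standing in for sheet.createRow(index)), so MergeRowsSketch and merge are illustrative names only, not POI API.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of the merge loop; no Excel library involved.
final class MergeRowsSketch {

    // Writes one header row, then appends the data rows of every source.
    static SortedMap<Integer, String[]> merge(String[] header,
                                              List<List<String[]>> sources,
                                              int firstDataRow) {
        SortedMap<Integer, String[]> target = new TreeMap<>();
        target.put(0, header);
        int nextRow = 1; // next free row in the target, advanced per copied row
        for (List<String[]> source : sources) {
            for (int r = firstDataRow; r < source.size(); r++) {
                // Reusing the same index here would silently overwrite data.
                target.put(nextRow++, source.get(r));
            }
        }
        return target;
    }
}
```

Rows are copied as they are, so a source with extra columns (like Test3.xlsx) simply produces longer rows in the target.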
I need to read an Excel file so that I can reference column indexes by name. I do it like this:
package main;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
public class ReadExcel {
public static void main(String[] args) {
try {
InputStream fs = new FileInputStream("/.../ListProducts.xls");
HSSFWorkbook wb = new HSSFWorkbook(fs);
HSSFSheet sheet = wb.getSheetAt(0);
Map<String, Integer> map = new HashMap<String,Integer>(); //Create map
HSSFRow row = sheet.getRow(0); //Get first row
//following is boilerplate from the java doc
short minColIx = row.getFirstCellNum(); //get the first column index for a row
short maxColIx = row.getLastCellNum(); //get the last column index for a row
for(short colIx=minColIx; colIx<maxColIx; colIx++) { //loop from first to last index
HSSFCell cell = row.getCell(colIx); //get the cell
map.put(cell.getStringCellValue(),cell.getColumnIndex()); //add the cell contents (name of column) and cell index to the map
}
List<ReportRow> listOfDataFromReport = new ArrayList<ReportRow>();
for(int x = 1; x<=sheet.getPhysicalNumberOfRows(); x++){
ReportRow rr = new ReportRow();
HSSFRow dataRow = sheet.getRow(x);
int idxForColumn1 = map.get("Id");
int idxForColumn2 = map.get("Name");
int idxForColumn3 = map.get("Price");
HSSFCell cell1 = dataRow.getCell(idxForColumn1);
HSSFCell cell2 = dataRow.getCell(idxForColumn2);
HSSFCell cell3 = dataRow.getCell(idxForColumn3);
rr.setColumn1(cell1.getStringCellValue());
rr.setColumn2(cell2.getStringCellValue());
rr.setColumn3(cell3.getStringCellValue());
listOfDataFromReport.add(rr);
}
for(int j = 0; j< listOfDataFromReport.size();j++){
System.out.println("Column 1 Value: " + listOfDataFromReport.get(j).getColumn1());
//etc...
}
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
When I run the program, I get this output:
null
The Excel file is already in the correct location, with the right column names.
EDIT
When I add e.printStackTrace();, I get this:
java.lang.NullPointerException
at main.ReadExcel.main(ReadExcel.java:51)
null
EDIT2
I noticed that the parameters of the setColumn1, setColumn2, ... methods are double.
So I do this:
rr.setColumn1(cell1.getNumericCellValue());
rr.setColumn2(cell2.getNumericCellValue());
rr.setColumn3(cell3.getNumericCellValue());
I get the following error when I try to run the program to read data:
Exception in thread "main" java.lang.IllegalStateException: Cannot get a NUMERIC value from a STRING cell
at org.apache.poi.hssf.usermodel.HSSFCell.typeMismatch(HSSFCell.java:654)
at org.apache.poi.hssf.usermodel.HSSFCell.getNumericCellValue(HSSFCell.java:679)
Well, I have a somewhat different opinion than @maytham-ɯɐɥʇʎɐɯ. Assuming the column names in your code are correct, I think this may be the problem:
for(int x = 1; x<=sheet.getPhysicalNumberOfRows(); x++){
Let's say the number of physical rows is 3, so they are numbered 0, 1, 2, and you are iterating up to 3. sheet.getRow(3) then returns null, and the later attempt to get a cell from it throws the NullPointerException because there is no such row.
Try x < sheet.getPhysicalNumberOfRows() (less than, not less than or equal) in the for loop I mentioned above.
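One caveat worth sketching: getPhysicalNumberOfRows() counts only the rows that are actually defined, while getLastRowNum() is the 0-based index of the last row, and getRow() returns null for any undefined row in between. If the sheet can have gaps, looping up to the last row index with a null check is the more defensive pattern. The sketch below models a sheet as a map from row index to cells so it runs without POI; RowLoopSketch and firstCells are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model: a missing key plays the role of getRow() returning null.
final class RowLoopSketch {

    // Returns the first cell of every data row, skipping the header (row 0).
    static List<String> firstCells(SortedMap<Integer, String[]> sheet) {
        List<String> out = new ArrayList<>();
        if (sheet.isEmpty()) {
            return out;
        }
        int lastRowNum = sheet.lastKey(); // stands in for sheet.getLastRowNum()
        for (int r = 1; r <= lastRowNum; r++) {
            String[] row = sheet.get(r);
            if (row == null || row.length == 0) {
                continue; // gap in the sheet: nothing to read here
            }
            out.add(row[0]);
        }
        return out;
    }
}
```

With rows defined at indexes 0, 1 and 3, the physical row count is 3 but the last row index is also 3, so a loop bounded by the physical count with `<` would never reach row 3.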
I have a large Excel file. I want to filter the column "Mainly used for" for the value "mainly used for mobile", then store the corresponding values from the "Number Series" column in a list. I have some code to start with, but I am not able to do the filtering or store the results in an ArrayList.
I did some digging and modified the code, but I still have not been able to meet my requirement. I have the following problems:
* The code only selects two columns and displays their contents. I am not able to filter.
* The Excel file has column names with spaces, so I am getting an error. As the file is generated by the user,
we have no control over the column names. How do I deal with column names containing spaces?
* The Excel file has alphanumeric values. How do I deal with them?
Could you please help me out here?
package com.excel;
import java.io.File;
import java.io.FileInputStream;
import java.math.BigDecimal;
import java.io.FileOutputStream;
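import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;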
public class Test {
public static void main(String[] args) throws Exception {
File excel = new File("D:\\FileDownload\\example.xls");
//File excel = new File("D:\\FileDownload\\Sample_Filtered.xls");
FileInputStream fis = new FileInputStream(excel);
//XSSFWorkbook wb = new XSSFWorkbook(fis);
HSSFWorkbook wb = new HSSFWorkbook(fis);
//org.apache.poi.ss.usermodel.Workbook wb = WorkbookFactory.create(fis);
HSSFSheet ws = wb.getSheetAt(0);
// org.apache.poi.ss.usermodel.Sheet ws = wb.getSheetAt(0);
ws.setForceFormulaRecalculation(true);
int rowNum = ws.getLastRowNum() + 1;
int colNum = ws.getRow(0).getLastCellNum();
int mainlyUsedForHeaderIndex = -1, mobileSeriesHeaderIndex = -1;
//Read the headers first. Locate the ones you need
HSSFRow rowHeader = ws.getRow(0);
for (int j = 0; j < colNum; j++) {
HSSFCell cell = rowHeader.getCell(j);
String cellValue = cellToString(cell);
if("Mainly used for".equalsIgnoreCase(cellValue)) {
//if("MainlyFor".equalsIgnoreCase(cellValue)) {
mainlyUsedForHeaderIndex = j;
} else if("Number Series".equalsIgnoreCase(cellValue)) {
//else if("MobileSeries".equalsIgnoreCase(cellValue)) {
mobileSeriesHeaderIndex = j;
}
}
if(mainlyUsedForHeaderIndex == -1 || mobileSeriesHeaderIndex == -1) {
throw new Exception("Could not find header indexes\n Mainly used for : " + mainlyUsedForHeaderIndex + " | Number Series: " + mobileSeriesHeaderIndex);
}else{
System.out.println("Indexes are found!!!");
}
//createnew workbook
XSSFWorkbook workbook = new XSSFWorkbook();
//Create a blank sheet
XSSFSheet sheet = workbook.createSheet("data");
for (int i = 1; i < rowNum; i++) {
HSSFRow row = ws.getRow(i);
//row = sheet.createRow(rowNum++);
String MainlyUsed = cellToString(row.getCell(mainlyUsedForHeaderIndex));
String ForMobile = cellToString(row.getCell(mobileSeriesHeaderIndex));
int cellIndex = 0;
XSSFRow newRow = sheet.createRow(i-1);
newRow.createCell(cellIndex++).setCellValue(MainlyUsed);
newRow.createCell(cellIndex++).setCellValue(ForMobile );
}
FileOutputStream fos = new FileOutputStream(new File("D:\\FileDownload\\test1.xlsx"));
System.out.println("File generated");
workbook.write(fos);
fos.close();
}
public static String cellToString(HSSFCell cell) {
int type;
Object result = null;
type = cell.getCellType();
switch (type) {/*
case HSSFCell.CELL_TYPE_NUMERIC:
result = BigDecimal.valueOf(cell.getNumericCellValue())
.toPlainString();
break;
case HSSFCell.CELL_TYPE_STRING:
result = cell.getStringCellValue();
break;
case HSSFCell.CELL_TYPE_BLANK:
result = "";
break;
case HSSFCell.CELL_TYPE_FORMULA:
result = cell.getCellFormula();*/
case HSSFCell.CELL_TYPE_BLANK:
result="";
break;
case HSSFCell.CELL_TYPE_BOOLEAN:
//
result = cell.getBooleanCellValue();
break;
case HSSFCell.CELL_TYPE_ERROR:
//
break;
case HSSFCell.CELL_TYPE_FORMULA:
result = cell.getCellFormula();
break;
case HSSFCell.CELL_TYPE_NUMERIC:
//
result = cell.getNumericCellValue();
break;
case HSSFCell.CELL_TYPE_STRING:
result= cell.getRichStringCellValue();
// result = cell.getStringCellValue();
break;
}
return result.toString();
}
}
I was able to meet my requirement using the following, entirely different approach.
package com.excel;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
public class ExcelRead {
public static void main(String[] args) throws IOException{
String fileName = "D:\\FileDownload\\example.xls";
String cellContent = "Mainly used for mobile";
int rownr=0;
int colnr = 0; //column from which you need data to store in array list
InputStream input = new FileInputStream(fileName);
HSSFWorkbook wb = new HSSFWorkbook(input);
HSSFSheet sheet = wb.getSheetAt(0);
List MobileSeries=new ArrayList();
MobileSeries = findRow(sheet, cellContent);
if(MobileSeries !=null){
for(Iterator iter=MobileSeries.iterator();iter.hasNext();){
System.out.println(iter.next());
}
}
//output(sheet, rownr, colnr);
finish();
}
private static void output(HSSFSheet sheet, int rownr, int colnr) {
/*
* This method displays the total value of the month
*/
HSSFRow row = sheet.getRow(rownr);
HSSFCell cell = row.getCell(colnr);
System.out.println("Your total is: " + cell);
}
private static List findRow(HSSFSheet sheet, String cellContent) {
List MobileSeries=new ArrayList();
for (Row row : sheet) {
for (Cell cell : row) {
if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
if (cell.getRichStringCellValue().getString().trim().equals(cellContent)) {
//System.out.println("Row numbers are"+row.getRowNum());
int rownumber=row.getRowNum();
//return row.getRowNum();
HSSFRow row1 = sheet.getRow(rownumber);
HSSFCell cell1 = row1.getCell(0);
MobileSeries.add(cell1);
}
}
}
}
return MobileSeries;
}
private static void finish() {
System.exit(0);
}
}
I need to read a specific column of an Excel sheet and then declare variables in Java from it. The program I have written reads the entire content of the sheet, but I need to read a fixed column, such as C.
This is what I have done:
import java.io.File;
import java.io.IOException;
import jxl.Cell;
import jxl.Sheet;
import jxl.Workbook;
import jxl.read.biff.BiffException;
public class JavaApplication4
{
private String inputFile;
String[][] data = null;
public void setInputFile(String inputFile)
{
this.inputFile = inputFile;
}
public String[][] read() throws IOException
{
File inputWorkbook = new File(inputFile);
Workbook w;
try
{
w = Workbook.getWorkbook(inputWorkbook);
// Get the first sheet
Sheet sheet = w.getSheet(0);
data = new String[sheet.getColumns()][sheet.getRows()];
// Loop over first 10 column and lines
// System.out.println(sheet.getColumns() + " " +sheet.getRows());
for (int j = 0; j <sheet.getColumns(); j++)
{
for (int i = 0; i < sheet.getRows(); i++)
{
Cell cell = sheet.getCell(j, i);
data[j][i] = cell.getContents();
// System.out.println(cell.getContents());
}
}
for (int j = 0; j < data.length; j++)
{
for (int i = 0; i <data[j].length; i++)
{
System.out.println(data[j][i]);
}
}
}
catch (BiffException e)
{
e.printStackTrace();
}
return data;
}
public static void main(String[] args) throws IOException
{
JavaApplication4 test = new JavaApplication4();
test.setInputFile("C://users/admin/Desktop/Content.xls");
test.read();
}
}
Here is my Excel sheet:
From a bowl of chits numbered /#v1#/ to /#v2#/ , a single chit is randomly drawn. Find the probability that the chit drawn is a number that is a multiple of /#v3#/ or /# v4#/?
I need to read this data and, by matching the pattern /#v1#/, declare the variables. How can I do this?
First get the number of columns using sheet.getColumns() and store the column names from the first row in a list. Then you can collect all the values per column, or only those for column "C". Try the code below and let me know if it works.
int masterSheetColumnIndex = sheet.getColumns();
List<String> ExpectedColumns = new ArrayList<String>();
for (int x = 0; x < masterSheetColumnIndex; x++) {
Cell celll = sheet.getCell(x, 0);
String d = celll.getContents();
ExpectedColumns.add(d);
}
LinkedHashMap<String, List<String>> columnDataValues = new LinkedHashMap<String, List<String>>();
List<String> column1 = new ArrayList<String>();
// read values from driver sheet for each column
for (int j = 0; j < masterSheetColumnIndex; j++) {
column1 = new ArrayList<String>();
for (int i = 1; i < sheet.getRows(); i++) {
Cell cell = sheet.getCell(j, i);
column1.add(cell.getContents());
}
columnDataValues.put(ExpectedColumns.get(j), column1);
}
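The question also asks how to pick out the /#v1#/ style placeholders once the column has been read. One possible approach, sketched here with java.util.regex and independent of the Excel library used, is to treat "/# name #/" as a pattern and tolerate stray spaces such as in "/# v4#/". PlaceholderSketch, names and fill are made-up names for illustration, and the assumption that placeholder names consist of word characters is mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of jxl or POI.
final class PlaceholderSketch {

    // Matches /#name#/ and allows whitespace around the name.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("/#\\s*(\\w+)\\s*#/");

    // Lists the placeholder names in the order they appear in the text.
    static List<String> names(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    // Replaces every known placeholder with its value; unknown ones stay as-is.
    static String fill(String text, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = value != null ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Applied to the sample question text, names(...) returns v1, v2, v3 and v4, which can then be used as keys for whatever values you want to substitute with fill(...).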
This is simple code that works as expected:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
public class TestExcelFile {
public static void main(String[] args) {
String envFilePath = System.getenv("AZURE_FILE_PATH");
// upload list of files/directory to blob storage
File folder = new File(envFilePath);
File[] listOfFiles = folder.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
if (listOfFiles[i].isFile()) {
System.out.println("File " + listOfFiles[i].getName());
Workbook workbook;
//int masterSheetColumnIndex = 0;
try {
workbook = WorkbookFactory.create(new FileInputStream(envFilePath + "\\"+ listOfFiles[i].getName()));
// Get the first sheet.
Sheet sheet = workbook.getSheetAt(0);
//we will search for column index containing string "Your Column Name" in the row 0 (which is first row of a worksheet
String columnWanted = "Column_Name";
Integer columnNo = null;
//output all not null values to the list
List<Cell> cells = new ArrayList<Cell>();
// Get the first cell.
Row row = sheet.getRow(0);
//Cell cell = row.getCell(0);
for (Cell cell : row) {
// Column header names.
//System.out.println(cell.toString());
if (cell.getStringCellValue().equals(columnWanted)){
columnNo = cell.getColumnIndex();
}
}
if (columnNo != null){
for (Row row1 : sheet) {
Cell c = row1.getCell(columnNo);
if (c == null || c.getCellType() == Cell.CELL_TYPE_BLANK) {
// Nothing in the cell in this row, skip it
} else {
cells.add(c);
//System.out.println(c);
}
}
}else{
System.out.println("could not find column " + columnWanted + " in first row of " + listOfFiles[i].getName());
}
} catch (InvalidFormatException | IOException e) {
e.printStackTrace();
}
}
}
}
}
Reading a particular column from an Excel file:
File myFile = new File(path);
FileInputStream fis = new FileInputStream(myFile);
// Finds the workbook instance for XLSX file
XSSFWorkbook myWorkBook = new XSSFWorkbook (fis);
//XSSFWorkbook workBook = new XSSFWorkbook();
//Reading sheet at number 0 in spreadsheet(image attached for reference
Sheet sheet = myWorkBook.getSheetAt(0);
//creating a Sheet object to retrieve object
Iterator<Row> itr = sheet.iterator();//iterating over excel file
while (itr.hasNext())
{
Row row = itr.next();
Iterator<Cell> cellIterator = row.cellIterator();//iterating over each column
//Reading cell in my case column name is ppm
Cell ppmEx= row.getCell(0);
//Cell cell = cellIterator.next();
while (cellIterator.hasNext())
{
Cell cell = cellIterator.next();
//Check the cell type and format accordingly
switch (cell.getCellType())
{
case Cell.CELL_TYPE_NUMERIC:
//System.out.println(cell.getNumericCellValue() + " ");
al.add(cell.getNumericCellValue());
break;
case Cell.CELL_TYPE_STRING:
//System.out.println(cell.getStringCellValue()+" ");
al.add(cell.getStringCellValue());
break;
case Cell.CELL_TYPE_BOOLEAN:
//System.out.println(cell.getBooleanCellValue()+" ");
al.add(cell.getBooleanCellValue());
break; // without this, a boolean cell falls through and also adds "blank"
case Cell.CELL_TYPE_BLANK:
//System.out.println("blank");
al.add("blank");
break;
}
}
System.out.println("-");
}
package xlsxreader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.*;
/**
*
* @author khaled
*/
public class XlsxReader {
/**
* @param args the command line arguments
*/
public static void main(String[] args) throws FileNotFoundException, IOException, InvalidFormatException {
File file = new File("C:\\Users\\khaled\\Desktop\\myXLSX file.xlsx");
Workbook workbook = WorkbookFactory.create(new FileInputStream(file));
Sheet sheet = workbook.getSheetAt(0);
int column_index_1 = 0;
int column_index_2 = 0;
int column_index_3 = 0;
Row row = sheet.getRow(0);
for (Cell cell : row) {
// Column header names.
switch (cell.getStringCellValue()) {
case "MyFirst Column":
column_index_1 = cell.getColumnIndex();
break;
case "3rd Column":
column_index_2 = cell.getColumnIndex();
break;
case "forth Column":
column_index_3 = cell.getColumnIndex();
break;
}
}
for (Row r : sheet) {
if (r.getRowNum()==0) continue; // skip the header row
Cell c_1 = r.getCell(column_index_1);
Cell c_2 = r.getCell(column_index_2);
Cell c_3 = r.getCell(column_index_3);
if (c_1 != null && c_1.getCellType() != Cell.CELL_TYPE_BLANK
&&c_2 != null && c_2.getCellType() != Cell.CELL_TYPE_BLANK
&&c_3 != null && c_3.getCellType() != Cell.CELL_TYPE_BLANK) {
System.out.print(" "+c_1 + " " + c_2+" "+c_3+"\n");
}
}
}
}
I have been given an assignment in which I need to split the data of a spreadsheet and write it into a new spreadsheet. The given spreadsheet may contain any number of merged cells, and I need to find those merged cells and write their data to a new spreadsheet.
That is, the data or cells from one merged cell up to the next merged cell must be written to another spreadsheet.
My attempt is given below:
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
public class CopyTest {
public static void main(String[] args) throws IOException {
CopyTest excel = new CopyTest();
excel.process("D:\\B3.xls");
}
public void process(String fileName) throws IOException {
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(fileName));
HSSFWorkbook workbook = new HSSFWorkbook(bis);
HSSFWorkbook myWorkBook = new HSSFWorkbook();
HSSFSheet sheet = null;
HSSFRow row = null;
HSSFCell cell = null;
HSSFSheet mySheet = null;
HSSFRow myRow = null;
HSSFCell myCell = null;
int sheets = workbook.getNumberOfSheets();
int fCell = 0;
int lCell = 0;
int fRow = 0;
int lRow = 0;
for (int iSheet = 0; iSheet < sheets; iSheet++) {
sheet = workbook.getSheetAt(iSheet);
if (sheet != null) {
mySheet = myWorkBook.createSheet(sheet.getSheetName());
fRow = sheet.getFirstRowNum();
System.out.println("First Row at"+fRow);
lRow = sheet.getLastRowNum();
for (int iRow = fRow; iRow <= lRow; iRow++) {
row = sheet.getRow(iRow);
myRow = mySheet.createRow(iRow);
if (row != null) {
fCell = row.getFirstCellNum();
lCell = row.getLastCellNum();
for (int iCell = fCell; iCell < lCell; iCell++) {
//if (mySheet.getMergedRegionAt(index)!=null)
System.out.println("Finding next merged Cells");
cell = row.getCell(iCell);
myCell = myRow.createCell(iCell);
if (cell != null) {
myCell.setCellType(cell.getCellType());
switch (cell.getCellType()) {
case HSSFCell.CELL_TYPE_BLANK:
myCell.setCellValue("");
break;
case HSSFCell.CELL_TYPE_BOOLEAN:
myCell.setCellValue(cell.getBooleanCellValue());
break;
case HSSFCell.CELL_TYPE_ERROR:
myCell.setCellErrorValue(cell.getErrorCellValue());
break;
case HSSFCell.CELL_TYPE_FORMULA:
myCell.setCellFormula(cell.getCellFormula());
break;
case HSSFCell.CELL_TYPE_NUMERIC:
myCell.setCellValue(cell.getNumericCellValue());
break;
case HSSFCell.CELL_TYPE_STRING:
myCell.setCellValue(cell.getStringCellValue());
break;
default:
myCell.setCellFormula(cell.getCellFormula());
// System.out.println("Reading Cell value\t"+myCell);
}System.out.println("Reading Cell value\t"+myCell);
}
}
}
}
}
}
bis.close();
BufferedOutputStream bos = new BufferedOutputStream(
new FileOutputStream("D:\\Result Excel1.xls", true));
myWorkBook.write(bos);
bos.close();
}}
With this code, I have managed to clone the spreadsheet into a new sheet. Where I am failing is finding the merged cells. getMergedCellRegionAt() helps and returns a merged cell region such as A:4 D:12. How do I proceed from here? Kindly help me; any effort is appreciated. Thanks in advance.
According to the Javadocs for HSSFSheet, getMergedCellRegionAt() was deprecated in 2008 because the Region it returns was also deprecated, in favor of CellRangeAddress. It suggests that you should use getMergedRegion(int) instead, which returns a CellRangeAddress.
The merged region data is not stored directly with the cells themselves, but with the Sheet object. So you do not need to loop through rows and cells looking for whether they are part of a merged region; you just need to loop through the list of merged regions on the sheet, then add the merged region to your new sheet with addMergedRegion(CellRangeAddress).
for (int i = 0; i < sheet.getNumMergedRegions(); i++)
{
CellRangeAddress mergedRegion = sheet.getMergedRegion(i);
// Just add it to the sheet on the new workbook.
mySheet.addMergedRegion(mergedRegion);
}
These methods on HSSFSheet are in the Sheet interface, so they will work with any Excel workbook that Apache POI supports, .xls (HSSF) or .xlsx (XSSF).
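If the goal is to split the sheet at each merged cell, the first rows of the merged regions (region.getFirstRow()) can serve as cut points. The sketch below only computes the row ranges, and it rests on an assumption about what the question means by "between one merged cell and the next": each merged region is taken to start a block that runs until the row before the next region. MergedSplitSketch and segments are hypothetical names; copying the rows of each range into its own sheet would still use the cell-copy loop from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper; works on plain row numbers, not on POI objects.
final class MergedSplitSketch {

    // mergedFirstRows: first row of each merged region, in any order.
    // Returns inclusive {startRow, endRow} pairs, one per block.
    static List<int[]> segments(List<Integer> mergedFirstRows, int lastRowNum) {
        TreeSet<Integer> starts = new TreeSet<>(mergedFirstRows); // sorted, de-duplicated
        List<int[]> result = new ArrayList<>();
        Integer previous = null;
        for (int start : starts) {
            if (start > lastRowNum) {
                break; // region lies beyond the data
            }
            if (previous != null) {
                result.add(new int[] {previous, start - 1});
            }
            previous = start;
        }
        if (previous != null) {
            // The last block runs to the end of the sheet.
            result.add(new int[] {previous, lastRowNum});
        }
        return result;
    }
}
```

For merged regions starting at rows 0, 4 and 8 in a sheet whose last row index is 11, this gives the blocks 0 to 3, 4 to 7 and 8 to 11.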
The merged cells have their value in the first cell.
The following method returns the value of the region, given the row and column of the first cell in the merged region:
String getMergedRegionStringValue(HSSFSheet sheet, int firstRow, int firstColumn){
for(int i = 0; i < sheet.getNumMergedRegions(); i++) {
CellRangeAddress region = sheet.getMergedRegion(i);
int colIndex = region.getFirstColumn();
int rowNum = region.getFirstRow();
//check first cell of the region
if(rowNum == firstRow && colIndex == firstColumn){
return sheet.getRow(rowNum).getCell(colIndex).getStringCellValue();
}
}
// No merged region starts at the given cell.
return null;
}
rgettman's answer is completely correct.
Java 8
I just wanted to add a solution for Java 8 with streams:
Sheet oldSheet, newSheet;
IntStream.range(0, oldSheet.getNumMergedRegions())
.mapToObj(oldSheet::getMergedRegion)
.forEach(newSheet::addMergedRegion);