I want to compare Excel files with each other just to see whether they are the same. I can choose my Excel files and read them. I have two Excel sheets with the same content, one in .xls and one in .xlsx format.
I use the following code to read my files (for .xls the same with HSSFWorkbook and so on):
private String xlsx(File inputFile) {
String outputString = "";
// For storing data into String
StringBuffer data = new StringBuffer();
try {
// Get the workbook object for XLSX file
XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
// Get first sheet from the workbook
XSSFSheet sheet = wBook.getSheetAt(0);
Row row;
Cell cell;
// Iterate through each rows from first sheet
Iterator<Row> rowIterator = sheet.iterator();
while (rowIterator.hasNext()) {
row = rowIterator.next();
// For each row, iterate through each columns
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
cell = cellIterator.next();
data.append(cell + ";");
}
data.append("\n");
}
System.out.println(data.toString());
outputString = data.toString();
wBook.close();
} catch (Exception ioe) {
ioe.printStackTrace();
}
return outputString;
}
My Excel sheet contains blank cells. When I read them from the .xls file I get DATA;;;;;DATA, which is correct, but when I do the same with the .xlsx file I get DATA;DATA.
Somehow the code skips empty cells. How can I fix this problem?
Thanks in Advance
After some more Google research and trying different things, I found a solution to my problem. The cell iterator skips empty cells because they have no value: getCell returns null for them, so the iterator never visits them. In the .xls file the same cells do not seem to be null, which is why the output differed.
My code:
private String xlsx(File inputFile) {
String outputString = "";
System.out.println("start");
// For storing data into String
StringBuffer data = new StringBuffer();
try {
// Get the workbook object for XLSX file
XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
// Get first sheet from the workbook
XSSFSheet sheet = wBook.getSheetAt(0);
// Decide which rows to process
int rowStart = 0;
int rowEnd = sheet.getLastRowNum()+1;
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
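Row r = sheet.getRow(rowNum);
if (r == null) {
// Rows that were never written are returned as null; keep the line count aligned
data.append("\n");
continue;
}
int lastColumn = r.getLastCellNum();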
for (int cn = 0; cn < lastColumn; cn++) {
Cell c = r.getCell(cn);
if (c == null) {
data.append("" + ";");
} else {
data.append(c + ";");
}
}
data.append("\n");
}
System.out.println(data.toString());
outputString = data.toString();
wBook.close();
} catch (Exception ioe) {
ioe.printStackTrace();
}
System.out.println("end");
return outputString;
}
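The difference between the two loops can be sketched without POI. In this minimal model a row is a TreeMap from column index to value that holds only the stored cells, roughly what the cell iterator sees. The names SparseRowDemo, joinStoredOnly and joinByIndex are made up for illustration and are not POI API.

```java
import java.util.TreeMap;

class SparseRowDemo {

    // Mimics the cell iterator: visits only the cells that are actually stored.
    static String joinStoredOnly(TreeMap<Integer, String> row) {
        StringBuilder sb = new StringBuilder();
        for (String value : row.values()) {
            sb.append(value).append(';');
        }
        return sb.toString();
    }

    // Mimics the index loop: visits every column up to the last stored one
    // and emits an empty field where no cell is stored.
    static String joinByIndex(TreeMap<Integer, String> row) {
        StringBuilder sb = new StringBuilder();
        int lastColumn = row.isEmpty() ? 0 : row.lastKey() + 1;
        for (int cn = 0; cn < lastColumn; cn++) {
            String value = row.get(cn);
            sb.append(value == null ? "" : value).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> row = new TreeMap<>();
        row.put(0, "DATA");
        row.put(5, "DATA"); // columns 1 to 4 were never stored
        System.out.println(joinStoredOnly(row)); // DATA;DATA;
        System.out.println(joinByIndex(row));    // DATA;;;;;DATA;
    }
}
```

As far as I know, recent POI versions can also hand back a blank cell instead of null via row.getCell(cn, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK), which would remove the null check from the loop; check that against the POI version you use.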
Related
I have one Excel file with 4 different sheets that I need to read for my project. The 4 sheets have different headers and different numbers of columns. When I delete all the headers and give every sheet the same number of columns, the code works. But I am not allowed to modify the Excel file.
Can somebody suggest a way to read the Excel file with its headers, even when the sheets have different numbers of columns? Here is my code to read the Excel file:
public class ReadExcelFileAndStore {
public List<Employee> getTheFileAsObject(String filePath){
List <Employee> employeeList = new ArrayList<>();
try {
FileInputStream file = new FileInputStream(new File(filePath));
// Get the workbook instance for XLS file
HSSFWorkbook workbook = new HSSFWorkbook(file);
int numberOfSheets = workbook.getNumberOfSheets();
//System.out.println(numberOfSheets);
//loop through each of the sheets
for(int i = 0; i < numberOfSheets; i++) {
// Get first sheet from the workbook
HSSFSheet sheet = workbook.getSheetAt(i);
String sheetName = workbook.getSheetName(i);
// Iterate through each rows from first sheet
Iterator <Row> rowIterator = sheet.rowIterator();
Row headerRow= rowIterator.next();
while (rowIterator.hasNext()) {
// Get Each Row
Row row = rowIterator.next();
// For each row, iterate through each columns
Iterator<Cell> cellIterator = row.cellIterator();
Employee employee = new Employee();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
int columnIndex = cell.getColumnIndex();
switch (columnIndex + 1) {
case 1:
employee.setEmpName(cell.getStringCellValue());
break;
case 2:
employee.setExtCode((int) cell.getNumericCellValue());
break;
}
}
employeeList.add(employee);
}
}
file.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return employeeList;
}
}
Please excuse me if I am not clear. English is not my first language.
I'm trying to write code that traverses the first row of an Excel file until it finds the column labeled 'Comments'. I want to run some action on the text in that column and then save the result in a new column at the end of the file. Can I traverse the xlsx file by index? And if so, how can I jump straight to a cell using that cell's coordinates?
public static void main(String[] args) throws IOException {
File myFile = new File("temp.xlsx");
FileInputStream fis = null;
try {
fis = new FileInputStream(myFile);
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
@SuppressWarnings("resource")
XSSFWorkbook myWorkBook = new XSSFWorkbook (fis);
XSSFSheet mySheet = myWorkBook.getSheetAt(0);
Iterator<Row> rowIterator = mySheet.iterator();
Row row = rowIterator.next();
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
String comment = cell.toString();
if (comment.equals("Comments"))
{
System.out.println("Hello");
}
}
}
}
For the question "Wanted to go to the second column's 3rd row I could use coordinates like (3, 2)?":
Yes, this is possible using CellUtil. The advantage over the methods in Sheet and Row is that the CellUtil methods return the row or cell if it already exists and create it only if it does not. Existing cells are therefore kept instead of being replaced by newly created ones, which would overwrite them.
Example:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.util.CellUtil;
import java.util.concurrent.ThreadLocalRandom;
public class CreateExcelCellsByIndex {
public static void main(String[] args) throws Exception {
Workbook workbook = new XSSFWorkbook();
Sheet sheet = workbook.createSheet();
//put content in R3C2:
Cell cell = CellUtil.getCell(CellUtil.getRow(3-1, sheet), 2-1); //-1 because apache poi's row and cell indexes are 0 based
cell.setCellValue("R3C2");
//put content in 10 random cells:
for (int i = 1; i < 11; i++) {
int r = ThreadLocalRandom.current().nextInt(4, 11);
int c = ThreadLocalRandom.current().nextInt(1, 6);
cell = CellUtil.getCell(CellUtil.getRow(r-1, sheet), c-1);
String cellcontent = "";
if (cell.getCellTypeEnum() == CellType.STRING) {
cellcontent = cell.getStringCellValue() + " ";
}
cell.setCellValue(cellcontent + i + ":R"+r+"C"+c);
}
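// try-with-resources closes the output stream even if write() throws
try (FileOutputStream out = new FileOutputStream("CreateExcelCellsByIndex.xlsx")) {
workbook.write(out);
}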
workbook.close();
}
}
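The get-or-create behaviour described above can be modelled in plain Java with nested maps. This is only a sketch of the idea, not how CellUtil is implemented; GetOrCreateDemo and cellAt are hypothetical names, and a StringBuilder stands in for a cell.

```java
import java.util.Map;
import java.util.TreeMap;

class GetOrCreateDemo {

    // row index -> (column index -> cell content); only touched cells exist
    private final Map<Integer, Map<Integer, StringBuilder>> rows = new TreeMap<>();

    // Takes 1-based coordinates, as in "R3C2", and converts them to 0-based indexes.
    // Returns the existing cell, or creates an empty one if none exists yet.
    StringBuilder cellAt(int row1, int col1) {
        return rows.computeIfAbsent(row1 - 1, r -> new TreeMap<>())
                   .computeIfAbsent(col1 - 1, c -> new StringBuilder());
    }

    public static void main(String[] args) {
        GetOrCreateDemo sheet = new GetOrCreateDemo();
        sheet.cellAt(3, 2).append("first");
        sheet.cellAt(3, 2).append(" second"); // same cell, earlier content kept
        System.out.println(sheet.cellAt(3, 2)); // first second
    }
}
```

The second cellAt(3, 2) call returns the cell created by the first one, which is why the random-cell loop in the example can append to content written in an earlier iteration.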
FileInputStream file = new FileInputStream(new File(fileLocation));
Workbook workbook = new XSSFWorkbook(file);
Sheet sheet = workbook.getSheetAt(0);
Map<Integer, List<String>> data = new HashMap<>();
int i = 0;
for (Row row : sheet) {
data.put(i, new ArrayList<String>());
for (Cell cell : row) {
switch (cell.getCellTypeEnum()) {
case STRING: ... break;
case NUMERIC: ... break;
case BOOLEAN: ... break;
case FORMULA: ... break;
default: data.get(i).add(" ");
}
}
i++;
}
I'm not sure what you mean by 2D index, but a Cell knows which column it belongs to so something like this should work:
...
Cell cell = cellIterator.next();
String comment = cell.toString();
int sourceColumnIndex = -1;
if (comment.equals("Comments")) {
System.out.println("Hello");
sourceColumnIndex = cell.getColumnIndex();
}
....
Similarly, define something like int targetColumnIndex to represent the column which will have the result from processing all the cells from the sourceColumnIndex column.
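A rough sketch of the sourceColumnIndex / targetColumnIndex idea, with the sheet modelled as a list of string rows so that it runs without POI. CommentColumnDemo and appendProcessedColumn are made-up names, and upper-casing the text is only a placeholder for whatever action is run on the comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CommentColumnDemo {

    // Writes a processed copy of the column named sourceHeader into a new column
    // at the end of the header row. Returns the new column's index, or -1 if the
    // header was not found.
    static int appendProcessedColumn(List<List<String>> table, String sourceHeader, String targetHeader) {
        List<String> header = table.get(0);
        int sourceColumnIndex = header.indexOf(sourceHeader);
        if (sourceColumnIndex < 0) {
            return -1;
        }
        int targetColumnIndex = header.size(); // first free column after the header
        header.add(targetHeader);
        for (int r = 1; r < table.size(); r++) {
            List<String> row = table.get(r);
            // A short row has no value in the source column; treat it as empty
            String text = sourceColumnIndex < row.size() ? row.get(sourceColumnIndex) : "";
            // Pad the row so the result lands exactly in the target column
            while (row.size() <= targetColumnIndex) {
                row.add("");
            }
            row.set(targetColumnIndex, text.toUpperCase(Locale.ROOT)); // placeholder action
        }
        return targetColumnIndex;
    }

    public static void main(String[] args) {
        List<List<String>> table = new ArrayList<>();
        table.add(new ArrayList<>(Arrays.asList("Id", "Comments")));
        table.add(new ArrayList<>(Arrays.asList("1", "looks fine")));
        System.out.println(appendProcessedColumn(table, "Comments", "Result")); // 2
        System.out.println(table.get(1)); // [1, looks fine, LOOKS FINE]
    }
}
```

With POI the same two indexes would be passed to row.getCell(sourceColumnIndex) and row.createCell(targetColumnIndex) while looping over the remaining rows.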
I have an Excel file with 3 columns. I already store the values of column "B" in an ArrayList and check whether each value is a duplicate. Now I have a problem writing the "Duplicate" marker to column "C". How do I write to a specific column?
Here is my code:
FileInputStream file = new FileInputStream(new File(
"file name"));
XSSFWorkbook workbook = new XSSFWorkbook(file);
XSSFSheet sheet = workbook.getSheetAt(0);
Iterator<Row> rowIterator = sheet.iterator();
ArrayList<String> col = new ArrayList<String>();
while (rowIterator.hasNext()) {
Row row = rowIterator.next();
System.out.println(row.getRowNum());
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
if(cell.getColumnIndex()==1) {
col.add(cell.getStringCellValue());
System.out.print(cell.toString());
}
}
System.out.println();
}
for(int a = 0; a < 14; a++) {
if(col.get(a).equals("Order ID")) {
if(col.get(a).equals(col.get(a+1))) {
System.out.println("ROW no "+a+"Double Order");
}
} else {
if(col.get(a).equals(col.get(a+1)) || col.get(a).equals(col.get(a-1))) {
if(col.get(a).trim().length()>0) {
System.out.println("ROW no "+a+"Double Order");
col.add("Double");
}
}
}
}
FileOutputStream fileOut = new FileOutputStream("file name");
workbook.write(fileOut);
fileOut.close();
You do not need to iterate twice, once to identify duplicate rows and once to mark them. You can do both in a single pass like this:
private static void identifyDuplicateOrders() throws IOException {
FileInputStream file = new FileInputStream(new File("D:\\home\\test_in.xlsx"));
final FileOutputStream fileOut = new FileOutputStream("D:\\home\\test_out.xlsx");
XSSFWorkbook workbook = null;
try {
workbook = new XSSFWorkbook(file);
final XSSFSheet sheet = workbook.getSheetAt(0);
final Iterator<Row> rowIterator = sheet.iterator();
final Set<String> orderIds = new HashSet<String>();
while (rowIterator.hasNext()) {
final Row row = rowIterator.next();
final int rowNumber = row.getRowNum();
// SKIP HEADER
if (rowNumber == 0) {
continue;
}
System.out.print("Row " + rowNumber);
// GET ORDER ID CELL
final Cell cell = row.getCell(1);
if (!orderIds.add(cell.getStringCellValue())) {
// CREATE DOUBLE ORDER CELL
row.createCell(2).setCellValue("Duplicate");
System.out.println(" " + cell.toString() + " is Duplicate.");
} else {
System.out.println(" Order is Unique");
}
}
workbook.write(fileOut);
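} finally {
// workbook is still null if the constructor threw, so guard the close call
if (workbook != null) {
workbook.close();
}
file.close();
fileOut.close();
}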
}
You can write the word Duplicate inside the loop that iterates over the rows; you don't need to hold an array of values. To check for duplicates you can use a Set and test whether add returns false.
For example:
public static void main(String... args) throws IOException {
try (FileInputStream file = new FileInputStream(new File(args[0]));
XSSFWorkbook workbook = new XSSFWorkbook(file)) {
Set<String> orders = new HashSet<>(20);
XSSFSheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
// skip header if needed
if (row.getRowNum() == 0) {
continue;
}
// 1 for OrderId as in example
Cell orderCell = row.getCell(1);
if (!orders.add(getValue(orderCell))) {
Cell infoCell = row.getCell(2);
if (infoCell == null) {
infoCell = row.createCell(2);
}
infoCell.setCellValue("Duplicate");
}
}
// write result in new file
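// try-with-resources closes the output stream even if write() throws
try (FileOutputStream out = new FileOutputStream(new File(args[0] + ".result.xlsx"))) {
workbook.write(out);
}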
}
}
private static String getValue(Cell orderCell) {
switch (orderCell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
return Double.toString(orderCell.getNumericCellValue());
case Cell.CELL_TYPE_STRING:
return orderCell.getStringCellValue();
}
return null;
}
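The Set.add check that both answers rely on can be tried in isolation. A minimal sketch, with the order IDs given as a plain list instead of being read from column B; DuplicateMarkerDemo and markDuplicates are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateMarkerDemo {

    // Returns one marker per order id: "Duplicate" for a repeated id,
    // an empty string for the first occurrence.
    static List<String> markDuplicates(List<String> orderIds) {
        Set<String> seen = new HashSet<>();
        List<String> markers = new ArrayList<>();
        for (String id : orderIds) {
            // Set.add returns false when the element is already in the set
            markers.add(seen.add(id) ? "" : "Duplicate");
        }
        return markers;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("A1", "B2", "A1", "C3", "B2");
        System.out.println(markDuplicates(ids)); // [, , Duplicate, , Duplicate]
    }
}
```

Note that only the second and later occurrences are marked, not the first one, which is also how the two answers above behave.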
I am trying to export data from a database to Excel. I have the data exported and currently stored in an ArrayList (this can be changed). I have been able to export the data to Excel, but all of the values are exported as Strings. I need them to keep their data type, i.e. currency/numeric.
I am using Apache POI and am having difficulty setting the data type of the fields to anything other than String. Am I missing something? Can someone please advise me on a better way of doing this? Any assistance would be greatly appreciated.
public static void importDataToExcel(String sheetName, ArrayList header, ArrayList data, File xlsFilename, int sheetNumber)
throws HPSFException, FileNotFoundException, IOException {
POIFSFileSystem fs = new POIFSFileSystem();
HSSFWorkbook wb = new HSSFWorkbook(new FileInputStream(xlsFilename));
HSSFSheet sheet = wb.createSheet(sheetName);
int rowIdx = 0;
short cellIdx = 0;
// Header
HSSFRow hssfHeader = sheet.createRow(rowIdx);
HSSFCellStyle cellStyle = wb.createCellStyle();
cellStyle.setAlignment(HSSFCellStyle.ALIGN_CENTER);
for (Iterator cells = header.iterator(); cells.hasNext();) {
HSSFCell hssfCell = hssfHeader.createCell(cellIdx++);
hssfCell.setCellStyle(cellStyle);
hssfCell.setCellValue((String) cells.next());
}
// Data
rowIdx = 1;
for (Iterator rows = data.iterator(); rows.hasNext();) {
ArrayList row = (ArrayList) rows.next();
HSSFRow hssfRow = (HSSFRow) sheet.createRow(rowIdx++);
cellIdx = 0;
for (Iterator cells = row.iterator(); cells.hasNext();) {
HSSFCell hssfCell = hssfRow.createCell(cellIdx++);
hssfCell.setCellValue((String) cells.next());
}
}
Logfile.log("sheetNumber = " + sheetNumber);
wb.setSheetName(sheetNumber, sheetName);
try {
FileOutputStream out = new FileOutputStream(xlsFilename);
wb.write(out);
out.close();
} catch (IOException e) {
throw new HPSFException(e.getMessage());
}
}
You need to check for the class of your cell value before you cast:
public static void importDataToExcel(String sheetName, List<String> headers, List<List<Object>> data, File xlsFilename, int sheetNumber)
throws HPSFException, FileNotFoundException, IOException {
POIFSFileSystem fs = new POIFSFileSystem();
Workbook wb;
try {
wb = WorkbookFactory.create(new FileInputStream(xlsFilename));
} catch (InvalidFormatException ex) {
throw new IOException("Invalid workbook format");
}
Sheet sheet = wb.createSheet(sheetName);
int rowIdx = 0;
int cellIdx = 0;
// Header
Row hssfHeader = sheet.createRow(rowIdx);
CellStyle cellStyle = wb.createCellStyle();
cellStyle.setAlignment(HSSFCellStyle.ALIGN_CENTER);
for (final String header : headers) {
Cell hssfCell = hssfHeader.createCell(cellIdx++);
hssfCell.setCellStyle(cellStyle);
hssfCell.setCellValue(header);
}
// Data
rowIdx = 1;
for (final List<Object> row : data) {
Row hssfRow = sheet.createRow(rowIdx++);
cellIdx = 0;
for (Object value : row) {
Cell hssfCell = hssfRow.createCell(cellIdx++);
if (value instanceof String) {
hssfCell.setCellValue((String) value);
} else if (value instanceof Number) {
hssfCell.setCellValue(((Number) value).doubleValue());
} else {
throw new RuntimeException("Cell value of invalid type " + value);
}
}
}
wb.setSheetName(sheetNumber, sheetName);
try {
FileOutputStream out = new FileOutputStream(xlsFilename);
wb.write(out);
out.close();
} catch (IOException e) {
throw new HPSFException(e.getMessage());
}
}
I have also added generics, which makes the code a lot more readable. You should also declare variables with the interface rather than the concrete class where possible, for example List instead of ArrayList and Row instead of HSSFRow.
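The instanceof check from the answer can be exercised on its own. The sketch below only decides which kind of cell a value would become; CellValueDispatchDemo and targetCellKind are made-up names, and the BLANK and BOOLEAN branches are my own additions that the answer does not cover.

```java
import java.math.BigDecimal;

class CellValueDispatchDemo {

    // Decides what kind of cell a database value should end up in.
    static String targetCellKind(Object value) {
        if (value == null) {
            return "BLANK";   // assumption: a null column value becomes an empty cell
        }
        if (value instanceof String) {
            return "STRING";
        }
        if (value instanceof Number) {
            return "NUMERIC"; // covers Integer, Long, Double and BigDecimal alike
        }
        if (value instanceof Boolean) {
            return "BOOLEAN";
        }
        throw new IllegalArgumentException(
                "Unsupported cell value type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(targetCellKind("Widget"));                // STRING
        System.out.println(targetCellKind(new BigDecimal("19.99"))); // NUMERIC
        System.out.println(targetCellKind(42));                      // NUMERIC
    }
}
```

Currency columns often arrive from the database as BigDecimal, which is a Number, so they take the numeric branch as long as the value is not converted to a String before it reaches this check.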
How do I get the index of the last column when reading an xlsx file using the Apache POI API?
There's a getLastRowNum method, but I can't find anything related to the number of columns...
EDIT:
I'm dealing with XLSX files
I think you'll have to iterate through the rows and check HSSFRow.getLastCellNum() on each of them.
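Check each Row and call Row.getLastCellNum(); the largest value over all rows gives the last column. Note that getLastCellNum() returns the index of the last cell plus one, so subtract 1 if you need the 0-based column index.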
Row r = sheet.getRow(rowNum);
int maxCell= r.getLastCellNum();
To find the last column that has a value in a given row, first get the row and then ask it for its last cell:
Syntax:
sheet.getRow(rowNumber).getLastCellNum();
rowNumber is the number of the row for which you want to know the last column that has a value.
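Taking the maximum over all rows can be sketched as follows. The int array stands in for the values that getLastCellNum() would return for each row, which as far as I know is the last cell index plus one, or -1 for a row without cells; LastColumnDemo and lastColumnIndex are made-up names.

```java
class LastColumnDemo {

    // lastCellNums[i] stands in for sheet.getRow(i).getLastCellNum().
    // Returns the 0-based index of the last used column, or -1 if no row has cells.
    static int lastColumnIndex(int[] lastCellNums) {
        int max = -1;
        for (int n : lastCellNums) {
            if (n > 0) {
                max = Math.max(max, n - 1); // undo the "plus one"
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Rows with 3 and 7 columns, one row without cells, one with 2 columns
        System.out.println(lastColumnIndex(new int[] {3, -1, 7, 2})); // 6
    }
}
```

In real code the loop would also have to skip rows for which sheet.getRow(i) returns null.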
Try this function:
private void maxExcelrowcol() {
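int row, col;
// Local variables have no default value, so initialize before comparing below
int maxrow = 0, maxcol = 0;
//Put file name here for example filename.xls
String filename = "filename.xls";
// 'static' is not allowed on a local variable
String TAG = "ExelLog";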
//you can use 'this' in place of context if you want
Context context = getApplicationContext();
try {
// Creating Input Stream
File file = new File(context.getExternalFilesDir(null), filename);
FileInputStream myInput = new FileInputStream(file);
// Create a POIFSFileSystem object
POIFSFileSystem myFileSystem = new POIFSFileSystem(myInput);
// Create a workbook using the File System
HSSFWorkbook myWorkBook = new HSSFWorkbook(myFileSystem);
// Get the first sheet from workbook
HSSFSheet mySheet = myWorkBook.getSheetAt(0);
//Row iterator
Iterator rowIter = mySheet.rowIterator();
while (rowIter.hasNext()) {
HSSFRow myRow = (HSSFRow) rowIter.next();
//Cell iterator for iterating from cell to next cell of a row
Iterator cellIter = myRow.cellIterator();
while (cellIter.hasNext()) {
HSSFCell myCell = (HSSFCell) cellIter.next();
row = myCell.getRowIndex();
col = myCell.getColumnIndex();
if (maxrow < row) {
maxrow = row;
}
if (maxcol < col) {
maxcol = col;
}
}
}
} catch(FileNotFoundException e) {
e.printStackTrace();
} catch(IOException e) {
e.printStackTrace();
}
}