Unable to write new excel using Apache POI after removing duplicate rows - java

I am new to Apache POI.
I have written a small program that removes duplicate records from an Excel file. I can successfully identify the duplicate records across sheets, but when I write to a new file after removing them, no output is generated.
Where am I going wrong?
Am I writing the file properly, or am I missing something?
public static void main(String args[]) {
DataFormatter formatter = new DataFormatter();
HSSFWorkbook input_workbook;
HSSFWorkbook workbook_Output_Final;
HSSFSheet input_workbook_sheet;
HSSFRow row_Output;
HSSFRow row_1_index;
HSSFRow row_2_index;
String value1 = "";
String value2 = "";
int count;
//main try catch block starts
try {
FileInputStream input_file = new FileInputStream("E:\\TEST\\Output.xls"); //reading from input file
input_workbook = new HSSFWorkbook(new POIFSFileSystem(input_file));
for (int sheetnum = 0; sheetnum < input_workbook.getNumberOfSheets(); sheetnum++) { //traversing sheets
input_workbook_sheet = input_workbook.getSheetAt(sheetnum);
int input_workbook_sheet_total_row = input_workbook_sheet.getLastRowNum(); //fetching last row nmber
for (int input_workbook_sheet_row_1 = 0; input_workbook_sheet_row_1 <= input_workbook_sheet_total_row; input_workbook_sheet_row_1++) { //traversing row 1
for (int input_workbook_sheet_row_2 = 0; input_workbook_sheet_row_2 <= input_workbook_sheet_total_row; input_workbook_sheet_row_2++) {
row_1_index = input_workbook_sheet.getRow(input_workbook_sheet_row_1); //fetching one iteration row index
row_2_index = input_workbook_sheet.getRow(input_workbook_sheet_row_2); //fetching sec iteration row index
if (row_1_index != row_2_index) {
count = 0;
value1 = "";
value2 = "";
for (int row_1_index_cell = 0; row_1_index_cell < row_1_index.getLastCellNum(); row_1_index_cell++) { //traversing cell for each row
try {
value1 = value1 + formatter.formatCellValue(row_1_index.getCell(row_1_index_cell)); //fetching row cells value
value2 = value2 + formatter.formatCellValue(row_2_index.getCell(row_1_index_cell)); //fetching row cells value
} catch (NullPointerException e) {
}
count++;
if (count == row_1_index.getLastCellNum()) {
if (value1.hashCode() == value2.hashCode()) { //remove the duplicate logic
System.out.println("deleted : " + row_2_index);
System.out.println("------------------");
input_workbook_sheet.removeRow(row_2_index);
}
}
}
}
}
}
}
FileOutputStream fileOut = new FileOutputStream("E:\\TEST\\workbook.xls");
input_workbook.write(fileOut);
fileOut.close();
input_file.close();
} catch (Exception e) {
//e.printStackTrace();
}
//main try catch block ends
}

A couple of things to note:
- You swallow every kind of Exception. I got some NullPointerExceptions with my test data, and an exception thrown before the write prevents the workbook from being written at all.
- When removing rows, it is an old trick to move backwards through the row numbers, because then you don't have to adjust for the row you have just removed.
- removeRow only empties the row; it doesn't move the rows below it upwards, so a gap is left after the delete. If you want to remove that gap, you can work with shiftRows.
- You compare things by hash code, which is possible in some use cases, but I feel like .equals() is what you want here. See also "Relationship between hashCode and equals method in Java".
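To illustrate the hashCode point with plain Java (no POI needed), here is a minimal sketch. The class name RowKeyDemo and the helper rowKey are made up for this illustration. It shows that two different strings can share a hash code, and that plain concatenation of cell texts can make two different rows look identical, which is why the sketch joins the cell texts with a separator (the unit-separator character, an arbitrary choice that is assumed not to occur in cell text) and compares with equals().

```java
import java.util.Arrays;
import java.util.List;

final class RowKeyDemo {
    // Hypothetical helper: joins cell texts with a separator so that
    // ["ab", "c"] and ["a", "bc"] do not end up with the same key.
    static String rowKey(List<String> cells) {
        return String.join("\u001F", cells);
    }

    public static void main(String[] args) {
        // Different strings, same hash code: equal hash codes prove nothing.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
        System.out.println("Aa".equals("BB"));                  // prints false

        // Plain concatenation loses the cell boundaries.
        System.out.println(("ab" + "c").equals("a" + "bc"));    // prints true

        // With a separator the two rows stay distinct.
        String key1 = rowKey(Arrays.asList("ab", "c"));
        String key2 = rowKey(Arrays.asList("a", "bc"));
        System.out.println(key1.equals(key2));                  // prints false
    }
}
```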
Here's some code that worked for my test data, feel free to comment if something doesn't work with your data:
public static void main(String[] args) throws IOException {
    DataFormatter formatter = new DataFormatter();
    HSSFWorkbook input_workbook;
    HSSFSheet input_workbook_sheet;
    HSSFRow row_1_index;
    HSSFRow row_2_index;
    String value1;
    String value2;
    FileInputStream input_file = new FileInputStream("c:\\temp\\test.xls");
    input_workbook = new HSSFWorkbook(new POIFSFileSystem(input_file));
    for (int sheetnum = 0; sheetnum < input_workbook.getNumberOfSheets(); sheetnum++) {
        input_workbook_sheet = input_workbook.getSheetAt(sheetnum);
        // traverse backwards so that removing and shifting rows never affects rows still to be visited
        for (int input_workbook_sheet_row_1 = input_workbook_sheet.getLastRowNum(); input_workbook_sheet_row_1 >= 0; input_workbook_sheet_row_1--) {
            // only compare with the rows below row 1, so the first occurrence is the one that is kept
            for (int input_workbook_sheet_row_2 = input_workbook_sheet.getLastRowNum(); input_workbook_sheet_row_2 > input_workbook_sheet_row_1; input_workbook_sheet_row_2--) {
                row_1_index = input_workbook_sheet.getRow(input_workbook_sheet_row_1);
                row_2_index = input_workbook_sheet.getRow(input_workbook_sheet_row_2);
                if (row_1_index != null && row_2_index != null) {
                    value1 = "";
                    value2 = "";
                    // getLastCellNum() is already the last cell index plus one
                    int max_cell = Math.max(row_1_index.getLastCellNum(), row_2_index.getLastCellNum());
                    for (int cell = 0; cell < max_cell; cell++) {
                        // formatCellValue returns "" for a missing cell; the tab keeps cell boundaries apart
                        value1 = value1 + formatter.formatCellValue(row_1_index.getCell(cell)) + "\t";
                        value2 = value2 + formatter.formatCellValue(row_2_index.getCell(cell)) + "\t";
                    }
                    // compare only after all cells have been collected
                    if (value1.equals(value2)) {
                        System.out.println("deleted : " + input_workbook_sheet_row_2);
                        System.out.println("------------------");
                        int last_row = input_workbook_sheet.getLastRowNum();
                        input_workbook_sheet.removeRow(row_2_index);
                        if (input_workbook_sheet_row_2 < last_row) {
                            // close the gap by moving the rows below up by one
                            input_workbook_sheet.shiftRows(input_workbook_sheet_row_2 + 1, last_row, -1, true, true);
                        }
                    }
                }
            }
        }
    }
    FileOutputStream fileOut = new FileOutputStream("c:\\temp\\workbook.xls");
    input_workbook.write(fileOut);
    fileOut.close();
    input_file.close();
    input_workbook.close();
}
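The "move backwards" advice is not specific to POI. As a rough illustration, the sketch below uses a plain ArrayList of row keys as a stand-in for the sheet; the class name BackwardRemovalDemo and the method removeDuplicates are invented for this example. Because elements are removed from the end first, the indexes that are still to be visited never shift.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BackwardRemovalDemo {
    // Keeps the first occurrence of each row key and removes later repeats.
    static List<String> removeDuplicates(List<String> rows) {
        List<String> result = new ArrayList<>(rows);
        for (int first = result.size() - 1; first >= 0; first--) {
            // Walk backwards: removing index 'later' only shifts elements
            // behind it, and those have already been visited.
            for (int later = result.size() - 1; later > first; later--) {
                if (result.get(first).equals(result.get(later))) {
                    result.remove(later); // remove(int index), not remove(Object)
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(removeDuplicates(rows)); // prints [a, b, c]
    }
}
```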

Related

Blank row after header

I am creating an Excel file by merging many Excel files. After the final file is generated, a blank row appears after the headers.
Below is my code, which reads data from multiple files and writes it to a particular blank file that has pivot formulas set.
I have tried the following:
1. Calling createRow(0) for the headers, then filling data starting from the next row.
2. Maintaining an int row counter, but that still didn't work.
3. Incrementing the getLastRowNum() count, but to no avail.
public class DCSReadImpl implements ReadBehavior {
Logger log = Logger.getLogger(DCSReadImpl.class.getName());
@SuppressWarnings("resource")
@Override
public Sheet readReport(Workbook workbook,Map<String,String> masterMap, Properties properties) {
//int firstRow = 0;
int outRowCounter = 0;
String fileToMove= "";
boolean headers = true;
Row outputRow = null;
Sheet outputSheet = null;
Workbook wb = new XSSFWorkbook();
try {
outputSheet = wb.createSheet("Data");
log.info("**** Set headers start"); // this used to be different method
int cellNo = 0;
outputRow = outputSheet.createRow(0);
for(String headerName : ReportConstants.DCS_OUTPUT_HEADER){
outputRow.createCell(cellNo).setCellValue(headerName);
cellNo++;
}
//outRowCounter++;
log.info("**** Set headers completed");
log.info("Read input file(s) for DCS report");
log.info("Input File Path : " + properties.getProperty(ReportConstants.DCS_INPUT_PATH));
File inputDir = new File(properties.getProperty(ReportConstants.DCS_INPUT_PATH));
File[] dirListing = inputDir.listFiles();
if (0 == dirListing.length) {
throw new Exception(properties.getProperty(ReportConstants.DCS_INPUT_PATH) + " is empty");
}
for (File file : dirListing) {
log.info("Processing : " + file.getName());
fileToMove = file.getName();
XSSFWorkbook inputWorkbook = null;
try {
inputWorkbook = new XSSFWorkbook(new FileInputStream(file));
} catch (Exception e) {
throw new Exception("File is already open, please close the file");
}
XSSFSheet inputsheet = inputWorkbook.getSheet("Sheet1");
Iterator<Row> rowItr = inputsheet.iterator();
int headItr = 0;
//log.info("Validating headers : " + file.getName());
while (rowItr.hasNext()) {
Row irow = rowItr.next();
Iterator<Cell> cellItr = irow.cellIterator();
int cellIntItr = 0;
String key = "";
int rowN = outputSheet.getLastRowNum() + 1;
outputRow = outputSheet.createRow(rowN);
Cell outCell = null;
while (cellItr.hasNext()) {
Cell inputCell = cellItr.next();
if (0 == inputCell.getRowIndex()) {
if (!FileUtility.checkHeaders(headItr, inputCell.getStringCellValue().trim(),
ReportConstants.DCS_INPUT_HEADER)) {
throw new Exception("Incorrect header(s) present in Input File, Expected : "
+ ReportConstants.DCS_INPUT_HEADER[headItr]);
}
headItr++;
} else {
//outCell = outputRow.createCell(cellIntItr);
if (0 == inputCell.getColumnIndex()) {
key = inputCell.getStringCellValue().trim();
} else if (2 == inputCell.getColumnIndex()) {
key = key + ReportConstants.DEL + inputCell.getStringCellValue().trim();
}
if (7 == cellIntItr){
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 0));
cellIntItr++;
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 1));
cellIntItr++;
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 2));
cellIntItr++;
}
// Check the cell type and format accordingly
switch (inputCell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(inputCell.getNumericCellValue());
break;
case Cell.CELL_TYPE_STRING:
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(inputCell.getStringCellValue().trim());
break;
}
cellIntItr++;
}
}
//outRowCounter ++ ;
}
if(!fileToMove.isEmpty()){
FileUtility.checkDestinationDir(""+properties.get(ReportConstants.DCS_ARCHIVE_PATH));
FileUtility.moveFile(properties.get(ReportConstants.DCS_INPUT_PATH) + fileToMove,
properties.get(ReportConstants.DCS_ARCHIVE_PATH)+fileToMove+FileUtility.getPattern());
}
}
} catch (Exception e) {
log.error("Exception occured : ", e);
}
FileOutputStream outputStream;
try {
outputStream = new FileOutputStream("D:\\DCS\\Output\\Krsna_"+FileUtility.getPattern()+".xlsx");
wb.write(outputStream);
} catch (Exception e) {
e.printStackTrace();
}
return outputSheet;
}
private String getValue(Map<String, String> masterMap, String cellKey, int index) {
String value = masterMap.get(cellKey);
if (null != value) {
String cellValue[] = value.split("\\" + ReportConstants.DEL);
return cellValue[index];
} else {
return "";
}
}
}
There should not be a blank row after the header row, that is, between row 0 and row 1 (I hope my understanding of row indexing is correct). I know this is a very basic question :-(

read multiple excel sheet selenium-webdriver, java, eclipse

I want to run Selenium WebDriver tests (Java, Eclipse) using an Excel file that contains multiple sheets with different names (sheet1, sheet2, sheet3, ...). I need a for loop that goes through these sheets and reads from each of them.
public class ExcelDataConfig {
XSSFWorkbook wb;
XSSFSheet sheet = null;
public ExcelDataConfig(String Excelpath) throws IOException {
// TODO Auto-generated method stub
try {
File file = new File(Excelpath);
// Create an object of FileInputStream class to read excel file
FileInputStream fis = new FileInputStream(file);
wb = new XSSFWorkbook(fis);
} catch (Exception e) {
}
}
public String GetData(int sheetNumber, int Row, int Column) {
Iterator<Row> rowIt=sheet.rowIterator();
DataFormatter formatter = new DataFormatter();
XSSFCell cell = sheet.getRow(Row).getCell(Column);
String data = formatter.formatCellValue(cell);
return data;
}
public int GetRowCount(String sheetNumber) {
int row = wb.getSheet(sheetNumber).getLastRowNum();
row = row + 1;
return row;
}
}
Try something like this; it is working for me. You need to put your sheet numbers and cell numbers in place of k and j.
String filePath="C:\\Users\\USER\\Desktop\\Book1.xlsx";// file path
FileInputStream fis=new FileInputStream(filePath);
Workbook wb=WorkbookFactory.create(fis);
ArrayList<String> ls=new ArrayList<String>();
for(int k=0; k<=3;k++)//k =sheet no
{
Sheet sh=wb.getSheetAt(k);
System.out.println(sh);
// int count=0;
for(int i=0;i<=sh.getLastRowNum();i++)
{
System.out.println("row no:"+i);
for(int j=0; j<=4;j++)//j=column no
{
try {
String values=sh.getRow(i).getCell(j).getStringCellValue().trim();
System.out.println(values);
//condetions
/* if(values.contains("condtn1"))
{
System.out.println("Value of cell "+values+" ith row "+(i+1));
ls.add(values);
count++;
}
if(values.contains("condn2"))
{
System.out.println("Value of cell "+values+" ith row "+(i+1));
ls.add(values);
count++;
}*/
}catch(Exception e){
}
}
}
}
Please try something similar to this:
for (int i = startRow; i < endRow + 1; i++) {
    for (int j = startCol; j < endCol + 1; j++) {
        // DataFormatter returns the cell text for any cell type, so a separate getStringCellValue() call (which throws on numeric cells) is not needed
        Cell cell = ExcelWSheet.getRow(i).getCell(j);
        testData[i - startRow][j - startCol] = formatter.formatCellValue(cell);
    }
}
The names used in the snippet are pretty self-explanatory. Let us know if you get stuck or need more info.

Apache POI - Reading excel file in 2D array - returning null values

I am trying to read a 2x2 matrix from Excel through Apache POI, but the first value returned by the 2D array is [null, null]. Please check my code and suggest suitable corrections.
public String[][] getDataArray(String sheetName)
{
String value ="";
String[][] data = null;
int rowCount = wb.getSheet(sheetName).getLastRowNum();
int colCount = wb.getSheet(sheetName).getRow(1).getLastCellNum()-1;
data = new String[rowCount][colCount];
for(int i=1; i<=rowCount;i++)
{
Row row = wb.getSheet(sheetName).getRow(i);
for(int j=0;j<colCount;j++)
{
Cell cell = row.getCell(j);
if(cell.getCellType()==Cell.CELL_TYPE_NUMERIC)
{
value = ""+cell.getStringCellValue();
}
else
{
value = cell.getStringCellValue();
}
data[i][j] = value;
}
}
return data;
}
The debug view shows that the first value stored in the variable data is [null, null].
From the Excel file I am trying to read, I need only the userName and password data (2x2), not the header or the Run mode data.
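Of course the value at index 0 will be null, because i starts from 1 and not 0:
for (int i = 1; i <= rowCount; i++) // i starts from one
...
data[i][j] = value;
Note that data has only rowCount rows, so with data[i][j] the last iteration (i == rowCount) would also index past the end of the array.
Either start i from 0, or shift the target index like this:
data[i-1][j] = value;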
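The index shift can be tried without POI. In this rough sketch a plain 2D array stands in for the sheet, with row 0 as the header; the class name SkipHeaderDemo, the method withoutHeader and the sample values are invented for illustration.

```java
import java.util.Arrays;

final class SkipHeaderDemo {
    // Copies every row except the header (row 0) into a new array.
    // Assumes the input contains at least the header row.
    static String[][] withoutHeader(String[][] sheet) {
        String[][] data = new String[sheet.length - 1][];
        for (int i = 1; i < sheet.length; i++) {
            // Source rows start at 1, target rows at 0, hence i - 1.
            data[i - 1] = sheet[i].clone();
        }
        return data;
    }

    public static void main(String[] args) {
        String[][] sheet = {
            {"userName", "password"},
            {"alice", "secret1"},
            {"bob", "secret2"}
        };
        System.out.println(Arrays.deepToString(withoutHeader(sheet)));
        // prints [[alice, secret1], [bob, secret2]]
    }
}
```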
public static String[][] getSheetData(final String fileName, final String workSheetName)
throws Exception {
Integer lastRow = null;
short lastCol = 0;
String[][] sheetData = null;
FileInputStream file=new FileInputStream(MettlTest.class.getClass().getResource("/" + fileName).getPath());
workbook = new XSSFWorkbook(file);
sheet = workbook.getSheet(workSheetName);
try {
XSSFRow row;
XSSFCell cell;
lastRow = sheet.getPhysicalNumberOfRows();
lastCol = sheet.getRow(1).getLastCellNum();
sheetData = new String[lastRow - 1][lastCol];
for (int r = 1; r < lastRow; r++) {
row = sheet.getRow(r);
if (row != null) {
for (int c = 0; c < lastCol; c++) {
cell = row.getCell(c);
if (cell == null) {
sheetData[r - 1][c] = null;
} else {
sheetData[r-1][c] = new DataFormatter().formatCellValue(cell);
}
}
}
}
return sheetData;
}
catch (final Exception e) {
throw e;
}
finally {
try {
file.close();
} catch (IOException io) {
Reporter.log("Unable to close File : " + fileName);
throw io;
}
}
}

String Array from excel column

How can I get a string array from an Excel column?
Let's say the column is like this
String0
String1
String2
String3
String4
and I want my array to be like: array[0]="String0", array[1]="String1" etc.
This is the code I am currently using but it always returns "null":
public static String[] excelvalue(String columnWanted, int sheet_no, String path) {
int i = 0;
String[] column_content_array = new String[140];
try {
int instindicator = -1;
FileInputStream file = new FileInputStream(new File(path));
HSSFWorkbook filename = new HSSFWorkbook(file);
HSSFSheet sheet = filename.getSheetAt(sheet_no);
Integer columnNo = null;
Integer rowNo = null;
List<Cell> cells = new ArrayList<Cell>();
Row firstRow = sheet.getRow(0);
for (Cell cell : firstRow) {
if (cell.getStringCellValue().equals(columnWanted)) {
columnNo = cell.getColumnIndex();
rowNo = cell.getRowIndex();
}
}
if (columnNo != null) {
for (Row row : sheet) {
Cell c = row.getCell(columnNo);
String cell_value = "" + c;
cell_value = cell_value.trim();
try {
if ((!cell_value.equals("")) && (!cell_value.equals("null")) && (!cell_value.equals(columnWanted))) {
column_content_array[i] = cell_value;
i++;
}
} catch (Exception e) {
}
}
return column_content_array;
}
} catch (Exception ex) {
return column_content_array;
}
return column_content_array;
}
Instead of storing just the last matching row and column, store all of them in lists like:
List<Integer> columnNos = new ArrayList<>();
List<Integer> rowNos = new ArrayList<>();
And in your for loop, add the row and column indexes to the lists:
if (cell.getStringCellValue().equals(columnWanted)) {
    columnNos.add(cell.getColumnIndex());
    rowNos.add(cell.getRowIndex());
}
Then you can iterate over the collected rows and columns and continue with your business logic.
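As a rough, POI-free illustration of collecting every match instead of only the last one, the sketch below treats the header row as a plain String array. The class name HeaderMatchDemo and the method matchingColumns are hypothetical; in the real code the header texts would come from the cells of the first row.

```java
import java.util.ArrayList;
import java.util.List;

final class HeaderMatchDemo {
    // Returns the index of every header cell whose text equals columnWanted.
    static List<Integer> matchingColumns(String[] headerRow, String columnWanted) {
        List<Integer> columnNos = new ArrayList<>();
        for (int col = 0; col < headerRow.length; col++) {
            // Calling equals on columnWanted tolerates null header cells.
            if (columnWanted.equals(headerRow[col])) {
                columnNos.add(col);
            }
        }
        return columnNos;
    }

    public static void main(String[] args) {
        String[] header = {"Name", "Id", "Name"};
        System.out.println(matchingColumns(header, "Name")); // prints [0, 2]
    }
}
```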

Selenium Data driven test- number read as decimal from excel

In my test, some fields accept numbers, which are supplied in an Excel file. But when the data is read from Excel, it comes back as decimals; for example, the number 22 is read as 22.0. I am using Apache POI for the test. How can I avoid the decimal?
public static void main(String[] args) throws InterruptedException {
try{
FileInputStream input = new FileInputStream("C:Users\\dinu\\Desktop\\RegDetails.xls");
HSSFWorkbook wb = new HSSFWorkbook(input);
HSSFSheet sheet = wb.getSheet("RegDetails");
for(int count =1; count <= sheet.getLastRowNum(); count++){
HSSFRow row = sheet.getRow(count);
System.out.println("Running Test Case "+row.getCell(0).toString());
runTest(row.getCell(1).toString(),row.getCell(2).toString(),row.getCell(3).toString(),
row.getCell(4).toString(),row.getCell(5).toString(),row.getCell(6).toString(),
row.getCell(7).toString(),row.getCell(8).toString());
}
input.close();
}catch (IOException e){
System.out.println("Test data not found.");
}
}
Use this, it should work:
for (int i = 0; i < sheet.getLastRowNum(); i++) {
    Row row = sheet.getRow(i + 1);
    for (int k = 0; k < sheet.getRow(0).getLastCellNum(); k++) {
        Cell cell = row.getCell(k);
        String value;
        try {
            value = cell.getRichStringCellValue().toString();
        } catch (Exception e) {
            // getRawValue() exists only on XSSFCell, i.e. for .xlsx workbooks
            value = ((XSSFCell) cell).getRawValue();
        }
        data[i][k] = value;
    }
}
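Some background on why 22 shows up as 22.0: POI hands numeric cell values back as a double, and the string form of the double 22 is "22.0". Other answers on this page use DataFormatter.formatCellValue(cell), which returns the text as the cell displays it. When only the raw double is at hand, a plain-Java conversion along the following lines is one option. This is a minimal sketch; the class name NumericTextDemo and the method asText are made up for illustration.

```java
final class NumericTextDemo {
    // Renders a numeric value without a trailing ".0" when it is a whole number.
    static String asText(double value) {
        boolean whole = value == Math.rint(value)
                && !Double.isInfinite(value)
                && Math.abs(value) < 1e15; // stay well inside the exact long range
        if (whole) {
            return String.valueOf((long) value);
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(asText(22.0)); // prints 22
        System.out.println(asText(22.5)); // prints 22.5
    }
}
```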
