How to write to an existing file using SXSSF? - java

I have an .xlsx file with multiple sheets containing different data. One of the sheets needs to hold close to 100,000 rows of data, and the data needs to be written from Java with Apache POI.
This seems quite fast and simple with SXSSFWorkbook, where I can keep only 100 rows in memory, but the disadvantage is that I can only write to a new file (or overwrite an existing file).
Also, I am not allowed to 'load' an existing file, i.e.,
SXSSFWorkbook wb = new SXSSFWorkbook(file_input_stream) is not allowed.
I can use Workbook factory:
Workbook workbook = new SXSSFWorkbook();
workbook = WorkbookFactory.create(file_input_stream);
but when the time comes for me to flush the rows,
((SXSSFSheet)sheet).flushRows(100);
I get the error that type conversion is not allowed from XSSFSheet to SXSSFSheet.
I tried to see if there was any way to copy sheets across different workbooks, but so far it seems it has to be done cell by cell.
Any insights on how to approach this problem?

You probably have a template to which you want to add a large amount of data. In that case, use the SXSSFWorkbook constructor that takes an XSSFWorkbook:
XSSFWorkbook wb = new XSSFWorkbook(new File("template.xlsx"));
SXSSFWorkbook wbss = new SXSSFWorkbook(wb, 100);
Sheet sheet = wbss.createSheet("sheet1");
// now add rows to sheet

Related

Questions on SXSSFWorkbook about its FlushedRows, Written to Disk and rowAccessWindowSize

I need to write millions of records to an Excel (.xlsx) file based on an already existing template (.xlsx). Initially I was using XSSFWorkbook, which predictably led to an OOM error.
I then switched to SXSSFWorkbook to avoid the OOM error, like below:
FileInputStream fis = new FileInputStream(file);
OPCPackage pkg = OPCPackage.open(fis);
XSSFWorkbook mainBook = new XSSFWorkbook(pkg);
SXSSFWorkbook wb = new SXSSFWorkbook(mainBook,200);
Sheet sh = wb.getSheet("Sheet1");
Row row0 = sh.createRow(0);
With SXSSFWorkbook we can't modify the existing template content, so I kept the template empty and planned to write the column headers along with the data.
But row0 = sh.createRow(0); throws the error "java.lang.IllegalArgumentException: Attempting to write a row[0] in the range [0,106403] that is already written to disk"
I don't understand how rows up to 106403 were written to disk, or what I should do next.
So I have these three questions:
What are flushed rows, and how can rows up to 106403 be flushed when I am only trying to create a new row?
What does "written to disk" mean?
When initializing SXSSFWorkbook I pass the parameter rowAccessWindowSize, in my case 200. What is rowAccessWindowSize and what does it do?
SXSSFWorkbook is for writing only. When an SXSSFWorkbook is created from a template XSSFWorkbook, a temporary file is created for each sheet in that XSSFWorkbook, and all existing rows in those sheets are written into those temporary files. After that, only new rows can be streamed into the temporary files.
The rowAccessWindowSize sets the number of rows that are kept in memory before they are flushed into the temporary files. Rows that have already been written to a sheet's temporary file can no longer be accessed, because they are no longer in memory but only in the temporary file. That is the reason for the low memory usage of SXSSF.
The error message java.lang.IllegalArgumentException: Attempting to write a row[0] in the range [0,106403] that is already written to disk. tells you that the rows with indexes 0 to 106403 (rows 1 to 106404) are already written to disk. That in turn tells you that your template sheet Sheet1 is not empty: there must be data at least in row 106404. That is why rows 1 to 106404 were written to Sheet1's temporary file when SXSSFWorkbook wb = new SXSSFWorkbook(mainBook,200); was executed. After that, only rows from row number 106405 onward (index 106404 and up) can be created on the SXSSFSheet.
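The windowing behaviour described in this answer can be illustrated with a small, self-contained model. This is only a sketch and not POI's actual implementation: the class WindowedRowWriter and all of its methods are hypothetical, the "temporary file" is just an in-memory list, and the rows that already exist in the template are represented by a starting "last flushed" index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified model of a streaming sheet with a row access window (hypothetical, not POI code). */
public class WindowedRowWriter {
    private final int windowSize;                                     // rows kept in memory
    private final TreeMap<Integer, String> window = new TreeMap<>();  // rows still accessible
    private final List<String> flushed = new ArrayList<>();           // stands in for the temp file
    private int lastFlushedRow;                                       // highest index already "on disk"

    /** lastTemplateRow is the last row index already present in the template, or -1 if it is empty. */
    public WindowedRowWriter(int windowSize, int lastTemplateRow) {
        this.windowSize = windowSize;
        this.lastFlushedRow = lastTemplateRow;
    }

    public void createRow(int rowIndex, String value) {
        // Rows at or below the last flushed index are no longer writable.
        if (rowIndex <= lastFlushedRow) {
            throw new IllegalArgumentException("Attempting to write a row[" + rowIndex
                    + "] in the range [0," + lastFlushedRow + "] that is already written to disk");
        }
        window.put(rowIndex, value);
        // Once the window is full, the oldest rows are flushed and become inaccessible.
        while (window.size() > windowSize) {
            Map.Entry<Integer, String> oldest = window.pollFirstEntry();
            flushed.add(oldest.getValue());
            lastFlushedRow = oldest.getKey();
        }
    }

    public int rowsInMemory() { return window.size(); }
    public int rowsFlushed() { return flushed.size(); }
    public int lastFlushedRow() { return lastFlushedRow; }
}
```

In this model, new WindowedRowWriter(200, 106403).createRow(0, "header") fails with the same kind of message as in the question, while createRow(106404, ...) succeeds, which mirrors the advice to either empty the template sheet or start writing after its last existing row.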

Force read only first sheet in Apache POI

I am using Apache POI to read the data in only the first sheet of an Excel file. The xlsx files that are submitted usually have only 1 sheet and are around 2.5MB (with a little more than 130k rows of data), and everything runs slowly but smoothly with no errors. However, if the submitted xlsx has more than one sheet, and the other sheet(s) also contain a lot of data, the execution throws an OutOfMemoryError: Java heap space error. Now I am trying to figure out whether it is somehow possible to always read only the data on the first sheet without worrying about memory errors (I am running this with the -Xmx1024m -Xms512m arguments).
EDIT: here is my code
InputStream inputStream = new FileInputStream(new File(excelfile));
XSSFWorkbook workbook = new XSSFWorkbook(inputStream);
if (workbook.getNumberOfSheets() != 1) {
throw new Exception("Make sure excel only has 1 sheet");
}
The program is throwing an error on the second line (if the excel file has a lot of data on the second sheet as well)
Apache POI often runs into memory problems with large files. I strongly recommend using monitorjbl's excel-streaming-reader instead: https://github.com/monitorjbl/excel-streaming-reader
InputStream is = new FileInputStream(new File(filePath));
Workbook workbook = StreamingReader.builder()
.rowCacheSize(100) // number of rows to keep in memory (defaults to 10)
.bufferSize(2048) // buffer size to use when reading InputStream to file (defaults to 1024)
.open(is); // InputStream or File for XLSX file (required)
Sheet sheet = workbook.getSheetAt(0);

Lock excel work book using apache poi api java

I have to lock a workbook while reading and writing a row.
Is there a way to accomplish this using the Apache POI API without the protect and password option?
As a suggestion, paste some of your code so that others can understand and solve the problem.
As far as I know you can't "lock" the whole Workbook, but one thing you can try is to apply a locked CellStyle to the cells you want to protect while reading or writing. Note that Excel only enforces the locked flag once the sheet itself is protected.
Workbook wb = new XSSFWorkbook();
CellStyle lockedCellStyle = wb.createCellStyle();
lockedCellStyle.setLocked(true);
Sheet sheet = wb.createSheet();
// .... Create rows and cells as needed
// When Writing or reading
Cell cell = getCellsToLockWithAnyMethod();
cell.setCellStyle(lockedCellStyle);

Out of Memory Error - Java Heap Space while writing to Excel

I have almost 100,000 records and I am trying to write them to an .xlsx file using XSSFWorkbook from Java code. I am able to fetch all the data from the database into an ArrayList. By iterating over the ArrayList, I write the data to the .xlsx file cell by cell.
When it reaches about 8000 rows, the Java code throws an Out of Memory (heap space) error.
I have read somewhere that SXSSFWorkbook is lighter than XSSFWorkbook, so I tried using SXSSFWorkbook, but I am still facing the same problem.
So is there anything that I am missing with the workbooks or in my Java code?
Initially, when I had 60,000 records, I used an .xls file. The same Java code was able to generate the .xls file with HSSFWorkbook.
Increasing the Java heap space is not an option, as my data will grow tremendously in the future.
Any help will be greatly appreciated.
Small piece of code, the way I am writing the data to Excel.
int rowNum = sheet.getLastRowNum();
Row lastRow = null ;
Cell cell = null;
ReportingHelperVo reportingHelperVo = null;
for (ReportingVo reportingVo : reportingVos) {
rowNum++;
lastRow = sheet.createRow(rowNum);
reportingHelperVo = reportingVo.reportingHelperVo;
cell = lastRow.createCell(0);
cell.setCellValue(reportingHelperVo.getLocation());
cell.setCellStyle(style);
cell = lastRow.createCell(1);
cell.setCellValue(reportingHelperVo.getCity());
cell.setCellStyle(style);
cell = lastRow.createCell(2);
cell.setCellValue(reportingHelperVo.getCountry());
cell.setCellStyle(style);
}
SXSSFWorkbook is not exactly lightweight, but it has one advantage.
If you declare it as
SXSSFWorkbook workbook= new SXSSFWorkbook(200);
then only the most recent 200 rows are kept in memory; older rows are flushed to temporary files on disk, so they no longer burden the heap.
XSSFWorkbook builds an in-memory object representation of the entire Excel document (it works like a DOM).
SXSSFWorkbook should require roughly constant memory. At what point does the JVM throw the OOM error? What type of ResultSet did you use? Try FORWARD_ONLY to limit how much of the data retrieved from the DB the JDBC driver caches.
BTW, the best way to fix an OutOfMemoryError is to analyze a heap dump.
Use the -XX:+HeapDumpOnOutOfMemoryError parameter and MAT to understand how your application uses memory.
"I am writing the data to .xlsx file cell by cell. As it reaches to 8000 rows, java code throws Out of Memory Heap Space Error."
Re-use existing Java objects instead of creating new ones on each iteration.
And/or use a csv file instead of excel.
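For the CSV suggestion, a minimal sketch using only the JDK might look like the following. The class name CsvExport, the method toLine and the quoting rules (RFC 4180 style: quote fields containing commas, quotes or line breaks, and double embedded quotes) are assumptions for illustration, not part of the original answer. Each record is turned into one line, so memory use does not grow with the number of records as long as the lines are written out as they are produced.

```java
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;

public class CsvExport {
    /** Quotes a field if it contains a comma, quote or line break; doubles embedded quotes. */
    public static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Builds one CSV line (terminated with CRLF) from the given fields. */
    public static String toLine(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(fields.get(i)));
        }
        return line.append("\r\n").toString();
    }

    public static void main(String[] args) {
        // In real use this would be a BufferedWriter on the target file.
        PrintWriter out = new PrintWriter(System.out);
        out.print(toLine(Arrays.asList("Location", "City", "Country")));
        out.print(toLine(Arrays.asList("HQ, Building 2", "Pune", "India")));
        out.flush();
    }
}
```

In the loop from the question, each ReportingHelperVo would become one toLine(...) call with its location, city and country; the sample values above are made up.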
Workbook workBook = new SXSSFWorkbook();
You can export more than 1 lakh (100000) records.
Page results from the database rather than reading them all in one go.
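The paging idea can be sketched without tying it to a particular database API. In the example below, the class PagedExport, the method exportInPages and the fetchPage callback are all hypothetical names; fetchPage is assumed to wrap something like a LIMIT/OFFSET query, and here it is simulated with an in-memory list so that the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedExport {
    /**
     * Pulls records page by page and hands each one to the sink.
     * fetchPage takes (offset, limit) and returns at most limit records.
     * Returns the total number of records processed.
     */
    public static <T> int exportInPages(BiFunction<Integer, Integer, List<T>> fetchPage,
                                        int pageSize, Consumer<T> sink) {
        int offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, pageSize);
            for (T record : page) {
                sink.accept(record); // e.g. create one sheet row per record
            }
            offset += page.size();
            if (page.size() < pageSize) {
                return offset; // a short or empty page means there is no more data
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the database table: 25 fake records.
        List<String> table = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            table.add("record-" + i);
        }
        int total = PagedExport.<String>exportInPages(
                (offset, limit) -> table.subList(offset, Math.min(offset + limit, table.size())),
                10,
                record -> { /* write the record to the sheet or CSV here */ });
        System.out.println("Exported " + total + " records");
    }
}
```

Only one page of records is held at a time, so combined with a streaming writer the memory footprint stays bounded by the page size rather than by the total number of records.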
I had similar problems a long time ago when attempting to write from R to an Excel file (but using XLConnection).
In the end, I solved it by using write.csv and then opening the file with Excel and using the "Text to Columns" button.
It is incredibly fast and reliable.
I had the same issue when my Excel file reached 3000 lines. In my case the main memory-related problem with POI Excel generation came from the cell styles. These are the things I changed in my code:
Try to set the style at row level.
If you really need to set the style for every cell, avoid setting a border on each and every cell.
Use the latest Apache POI JAR, and use SXSSF for streaming or downloading:
SXSSFWorkbook workbook = new SXSSFWorkbook(100);
workbook.setCompressTempFiles(true);
Sheet sh = workbook.createSheet();
((SXSSFSheet) sh).setRandomAccessWindowSize(100);
//write your logic
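// MIME type for .xlsx files (application/vnd.ms-excel is the type for legacy .xls)
response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
response.setHeader("Content-Disposition", "attachment; filename=" + filename + ".xlsx");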
workbook.write(response.getOutputStream());
workbook.close();
workbook.dispose();
I had out-of-memory issues writing an XSSFWorkbook to a file.
The suggestions above helped a lot.
See http://poi.apache.org/components/spreadsheet/how-to.html#xssf_sax_api
Change XSSFWorkbook wb = new XSSFWorkbook();
to SXSSFWorkbook wb = new SXSSFWorkbook(-1);
XSSFSheet sh = ... to SXSSFSheet sh = ...
XSSFRow to SXSSFRow
XSSFCell to SXSSFCell
Inside the for loop, call sh.flushRows(100); at every 100th row (with a window size of -1, rows are only flushed when you do so explicitly).
After wb.write(out);
add wb.dispose();

how to copy one workbook sheet to another workbook sheet using apache POI and java [duplicate]

This question already has answers here:
Copying Excel Worksheets in POI
I have one Excel file with a single sheet (abstract model). Now I want to copy that sheet to another existing workbook. How can I do this?
For regular styles and data we could iterate over each cell.
But if the sheet has formula cells, non-editable cells or merged cells, then this is the better solution to go for:
I have tried copying sheets, but it didn't work.
In addition, I need to copy a sheet that contains some non-editable cells and formula cells too.
This solution worked for me:
1) copy the entire workbook
2) delete the unnecessary sheets
3) add your new sheets to that workbook
// Input: source Excel file which contains the sheets to be copied
FileInputStream file = new FileInputStream("C:\\SamepleTemplate.xlsx");
XSSFWorkbook workbookinput = new XSSFWorkbook(file);
// Note: this is a second reference to the same in-memory workbook, not a separate copy;
// the actual copy is produced when the workbook is written to a different file below
XSSFWorkbook workbookoutput = workbookinput;
// Delete one or two unnecessary sheets; you can delete them by specifying the sheet names
workbookoutput.removeSheetAt(workbookoutput.getSheetIndex(workbookoutput.getSheet(" your sheet name ")));
// To delete several sheets by index, loop from the highest index down,
// because the remaining sheets shift down by one after each removal
for (int index = 5; index >= 0; index--)
{
workbookoutput.removeSheetAt(index);
}
// Write your changes to the new workbook file
FileOutputStream out = new FileOutputStream("C:\\FinalOutput.xlsx");
workbookoutput.write(out);
out.close();
