Java: How to write to .xlsx with a low memory footprint

I have an input .xlsx file, f1, and I am trying to create another file, f2, from f1 after some data modification.
I am currently using the Apache POI streaming API (SXSSFWorkbook) for this, but I cannot keep the memory footprint low. Here is my code snippet:
import com.monitorjbl.xlsx.StreamingReader;
public static void streamingWriter()
{
    SXSSFWorkbook workbook = new SXSSFWorkbook(50);
    workbook.setCompressTempFiles(true);
    try (InputStream inputStream = new FileInputStream(new File("inputFile.xlsx"));
         Workbook inputWorkbook = StreamingReader.builder()
                 .rowCacheSize(50)
                 .bufferSize(512)
                 .open(inputStream)) {
        Runtime runtime = Runtime.getRuntime();
        Sheet newSheet;
        Row newRow;
        Cell newCell;
        for (Sheet sheet : inputWorkbook) {
            newSheet = workbook.createSheet();
            for (Row row : sheet) {
                newRow = newSheet.createRow(row.getRowNum());
                for (Cell cell : row) {
                    newCell = newRow.createCell(cell.getColumnIndex());
                    copyCell(cell, newCell, workbook);
                }
            }
        }
        System.out.println("Mem2: " + (runtime.totalMemory() - runtime.freeMemory()));
        String fileName = "outputFile.xlsx";
        FileOutputStream outputStream = new FileOutputStream(fileName);
        workbook.write(outputStream);
        System.out.println("Mem3: " + (runtime.totalMemory() - runtime.freeMemory()));
        outputStream.flush();
        outputStream.close();
        workbook.dispose();
    } catch (IOException e) {
        System.out.println("error releasing resources: " + e.getClass().getSimpleName() +
                e.getMessage());
    }
}
Here are the run results -
Mem1: 112MB
Mem2: 464MB
Mem3: 697MB
The size of the original "inputFile.xlsx" is 223KB.
As the run results show, calling workbook.write() takes a lot of memory. Is there a way to write to an Excel file without using extra memory?
The main goal is to reduce both Mem2 and Mem3 in the run results.
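A note on the measurements themselves: runtime.totalMemory() - runtime.freeMemory() counts every object still on the heap, including garbage that has not been collected yet, so the Mem figures can overstate what the workbooks actually hold. The following is a minimal sketch, not a fix for the memory usage. The names MemoryProbe, usedBytes, usedBytesAfterGc and toMb are made up for illustration, and System.gc() is only a hint that the JVM may ignore.

```java
final class MemoryProbe {
    private MemoryProbe() {}

    /** Heap currently in use, including garbage that has not been collected yet. */
    static long usedBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Same figure, sampled after asking the JVM to collect garbage (a hint only). */
    static long usedBytesAfterGc() {
        System.gc();
        return usedBytes();
    }

    /** Formats a byte count as whole megabytes, e.g. for "Mem2: ..." style output. */
    static String toMb(long bytes) {
        return (bytes / (1024 * 1024)) + "MB";
    }

    public static void main(String[] args) {
        System.out.println("Used (raw): " + toMb(usedBytes()));
        System.out.println("Used (after GC hint): " + toMb(usedBytesAfterGc()));
    }
}
```

Sampling with usedBytesAfterGc() at the Mem2 and Mem3 points would give figures closer to the live data, which makes it easier to tell retained memory apart from short-lived allocation.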

Related

Zip file instead of an Excel file when using XSSFWorkbook in Java: how do I open this?

I am trying to write an Excel file using Java. For now I just want a single column with one username per row, and I will build on this once I understand what is going on a bit better. I get a zip file instead of the expected Excel file, and it contains docProps, _rels, xl, and [Content_Types].xml. I don't understand how to open this zip file as though it were an Excel file. I have not had luck finding the answer, as all the tutorials I see show the result to be a straightforward Excel file, not a zip file. Is there a configuration I'm missing, or does it have to do with Linux?
Here's my code, and what I end up with:
private void createExcelSheet(Assignment assignment) throws FileNotFoundException {
    String excelFilePath = Configuration.DIRECTORY_ROOT+"/tests/"+assignment.getAssn_number()+"/gradebook-"+assignment.getAssn_number();
    int rowNum = 0;
    int col = 0;
    XSSFWorkbook workbook = new XSSFWorkbook();
    XSSFSheet spreadsheet = workbook.createSheet(assignment.getAssn_number()+" Grades ");
    XSSFRow row = spreadsheet.createRow(rowNum);
    Cell cell;
    for (User user : userService.getUsers() ) {
        row = spreadsheet.createRow(rowNum);
        // Use the fixed column index so that every ID lands in the same column
        cell = row.createCell(col);
        cell.setCellValue(user.getStudent_id());
        rowNum++;
    }
    try (FileOutputStream fos = new FileOutputStream(excelFilePath)) {
        workbook.write(fos);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
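An .xlsx file is a ZIP package, which is why docProps, _rels, xl and [Content_Types].xml appear when the file is opened as an archive. The path built above has no file extension, so a plausible explanation (not confirmed in this thread) is that the desktop identifies the file by its content and treats it as a zip. The following is a minimal sketch under that assumption. XlsxFileNames, withXlsxExtension and startsWithZipSignature are hypothetical helper names, not part of Apache POI.

```java
import java.util.Locale;

final class XlsxFileNames {
    private XlsxFileNames() {}

    /** Appends ".xlsx" unless the path already ends with it (case-insensitive). */
    static String withXlsxExtension(String path) {
        return path.toLowerCase(Locale.ROOT).endsWith(".xlsx") ? path : path + ".xlsx";
    }

    /** True if the bytes start with the ZIP local-file-header signature "PK\3\4". */
    static boolean startsWithZipSignature(byte[] header) {
        return header.length >= 4
                && header[0] == 0x50   // 'P'
                && header[1] == 0x4B   // 'K'
                && header[2] == 0x03
                && header[3] == 0x04;
    }

    public static void main(String[] args) {
        // Hypothetical path standing in for excelFilePath from the question
        System.out.println(withXlsxExtension("/tests/3/gradebook-3"));
    }
}
```

Passing withXlsxExtension(excelFilePath) to the FileOutputStream would give the output a name that spreadsheet applications recognise, while the bytes written by workbook.write() stay the same.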

How can I load a CSV file into an Excel sheet using Java?

I have an Excel spreadsheet whose first sheet is designated for the raw data. Three more sheets are coded to transform and format the data from the raw sheet, and the fifth sheet has the final output.
How can I use Java to:
1. load the data from the CSV file into the first sheet of the Excel file?
2. save the data from the fifth sheet into a new CSV file?
Also, if the original CSV has thousands of rows, I assume the multi-sheet transformations would take some time before the fifth sheet gets all the final data. Is there a way to know?
I would follow this approach:
1. Load the specific .csv file and prepare to read it with Java.
2. Load the .xlsx file and change it according to your requirements and the data that you get from the .csv file. A small example of how an Excel file is changed with Apache POI can be seen below:
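import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

try
{
    // CSV data keyed by row index, e.g. ( 0 -> [ "Column1", "Column2" ]...); assumed to be filled from the .csv beforehand
    Map<Integer, List<String>> fileData = new TreeMap<>();
    // Working with the Excel file now
    FileInputStream file = new FileInputStream("Data.xlsx");
    XSSFWorkbook workbook = new XSSFWorkbook(file); // loads the whole workbook into memory
    file.close();
    XSSFSheet sheet = workbook.getSheetAt(0);
    for (Map.Entry<Integer, List<String>> entry : fileData.entrySet())
    {
        int rowIndex = entry.getKey();
        List<String> csvRow = entry.getValue();
        // Retrieve the row, creating it if it does not exist yet
        XSSFRow sheetRow = sheet.getRow(rowIndex);
        if (sheetRow == null)
        {
            sheetRow = sheet.createRow(rowIndex);
        }
        for (int i = 0; i < csvRow.size(); i++)
        {
            // Retrieve the cell, creating it if needed, then update its value
            XSSFCell cell = sheetRow.getCell(i);
            if (cell == null) {
                cell = sheetRow.createCell(i);
            }
            cell.setCellValue(csvRow.get(i));
        }
    }
    // The input stream is already closed, so the same file can be overwritten
    FileOutputStream outFile = new FileOutputStream(new File("Data.xlsx"));
    workbook.write(outFile);
    outFile.close();
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}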
After saving the .xlsx file, you can create the .csv file by following this question.
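The first step of the approach, reading the .csv into a row-indexed map of the shape 0 -> [ "Column1", "Column2" ], could be sketched as follows. This is a simplified illustration, not a full CSV parser: it splits on plain commas and does not handle quoted fields or embedded line breaks. CsvRows and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CsvRows {
    private CsvRows() {}

    /**
     * Turns CSV lines into rowIndex -> cell values.
     * Assumption: fields never contain commas or quotes.
     */
    static Map<Integer, List<String>> parse(List<String> lines) {
        // TreeMap keeps the rows sorted by index when iterating
        Map<Integer, List<String>> rows = new TreeMap<>();
        int rowIndex = 0;
        for (String line : lines) {
            // The -1 limit keeps trailing empty columns instead of dropping them
            rows.put(rowIndex++, new ArrayList<>(Arrays.asList(line.split(",", -1))));
        }
        return rows;
    }

    public static void main(String[] args) {
        // In a real program the lines could come from Files.readAllLines(...)
        Map<Integer, List<String>> rows = parse(Arrays.asList("Column1,Column2", "1,First"));
        System.out.println(rows);
    }
}
```

A sorted map is used here so that rows are visited in index order. For real-world CSV files with quoting, a dedicated CSV library would be the safer choice.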

Why is information not loaded from Excel (*.xlsx) using Apache POI?

I am trying to load information from an Excel file (*.xlsx) into a JSP using Apache POI 3.15.
The information in the Excel file looks like this:
num solution
1 First
2 Second
3 Third
I am trying to use this code:
try {
    InputStream ExcelFileToRead = new FileInputStream("C:\\server\\to_db.xlsx");
    XSSFWorkbook wb = new XSSFWorkbook(ExcelFileToRead);
    XSSFWorkbook test = new XSSFWorkbook();
    XSSFSheet sheet = wb.getSheetAt(0);
    XSSFRow row;
    XSSFCell cell;
    Iterator rows = sheet.rowIterator();
    while (rows.hasNext()) {
        row = (XSSFRow) rows.next();
        Iterator cells = row.cellIterator();
        while (cells.hasNext()) {
            cell = (XSSFCell) cells.next();
            if (cell.getCellType() == XSSFCell.CELL_TYPE_STRING) {
                out.print(cell.getStringCellValue() + " ");
            } else if (cell.getCellType() == XSSFCell.CELL_TYPE_NUMERIC) {
                out.print(cell.getNumericCellValue() + " ");
            } else {
                // You can handle Boolean, formula, and error cells here
            }
        }
        out.println("Successfully!!!");
    }
}
catch (Exception e) {
    out.println( "exception: "+e);
}
I get a strange result: no error and no information on the JSP.
The problem is reproduced in all browsers.
If I try to open C:\server\to_db.xlsx, Windows responds "File is busy".
What could be the problem and how can I solve it?
When you open a FileInputStream or FileOutputStream, you need to close it; otherwise the file can stay locked by the process, depending on the OS, and especially on Windows. More generally, you need to close every Closeable object you use to prevent leaks and issues like this one.
So you should rewrite your code as follows, using the try-with-resources statement, which closes the resources for you automatically:
try (InputStream ExcelFileToRead = new FileInputStream("C:\\server\\to_db.xlsx");
XSSFWorkbook wb = new XSSFWorkbook(ExcelFileToRead)) {
...
Indeed, your code has two Closeable objects: ExcelFileToRead and wb.
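To see what try-with-resources does with two resources, here is a small self-contained sketch that needs no Excel file. Tracked is a hypothetical stand-in for the stream and the workbook; it only records when close() is called. Resources are closed automatically when the block exits, in the reverse order of their declaration.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {
    private CloseOrderDemo() {}

    /** Hypothetical stand-in for a stream or workbook: records when it is closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    /** Opens two resources, does some "work", and returns the recorded events. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked stream = new Tracked("stream", log);
             Tracked workbook = new Tracked("workbook", log)) {
            log.add("reading");
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [reading, closed workbook, closed stream]
        System.out.println(run());
    }
}
```

The same closing also happens when the body throws an exception, which is why the file lock is released even if reading the workbook fails.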
I think you have to close the file using .close(). This may help you.
Some commons libraries were missing from the lib directory:
commons-collections4-4.1.jar, commons-codec-1.10.jar, commons-fileupload-1.3.jar
Adding them solved the problem.

Avoid overwriting xlsx file in Apache poi

I am using Apache POI to write to an Excel file and offer that file for download. But each time I download it, the existing file is overwritten and the file size keeps increasing.
I want to create a new file with the same name each time.
ServletContext servletContext = httpSession.getServletContext();
String absolutePathToIndexJSP = servletContext.getRealPath("/") + "File/filename.xlsx";
FileInputStream fis = new FileInputStream(new File(absolutePathToIndexJSP));
System.out.println("file path : " + absolutePathToIndexJSP);
XSSFWorkbook workbook = new XSSFWorkbook(fis);
XSSFSheet sheet = workbook.getSheetAt(0);
XSSFCellStyle style = workbook.createCellStyle();
style.setAlignment(XSSFCellStyle.ALIGN_RIGHT);
XSSFRow row = sheet.createRow(0);
row.setHeight((short) 2000);
XSSFCell r1c = row.createCell(0);
row.removeCell(r1c);
r1c.setCellValue("Photo");
for (int s = 0; s < arrayJson.length(); s++) {
    System.out.println(s);
    int imageCount = s + 1;
    System.out.println(imageCount);
    String absolutePathToImage = servletContext.getRealPath("/") + "imgData/" + imageCount + ".jpg";
    System.out.println("writing image");
    System.out.println("path : " + absolutePathToImage);
    InputStream inputStream = new FileInputStream(absolutePathToImage);
    byte[] bytes = IOUtils.toByteArray(inputStream);
    int pictureIdx = workbook.addPicture(bytes, Workbook.PICTURE_TYPE_JPEG);
    inputStream.close();
    CreationHelper helper = workbook.getCreationHelper();
    Drawing drawing = null;
    drawing = sheet.createDrawingPatriarch();
    ClientAnchor anchor = helper.createClientAnchor();
    row.removeCell(r1c);
    anchor.setCol1(s + 1);
    anchor.setRow1(0);
    double scale = 0.11;
    // Creates a picture
    Picture pict = drawing.createPicture(anchor, pictureIdx);
    // Resize the picture to the given fraction of its original size
    pict.resize(scale);
}
fos = new FileOutputStream(absolutePathToIndexJSP);
System.out.println("file written");
workbook.write(fos);
fos.flush();
fos.close();
From what you're asking, you basically just want to delete the old file and create a new one each time, right?
If you're not concerned with possible collisions (two users attempting to download from the same source location at the same time), then you could use the delete method of java.io.File to delete the file and then create a new one. So, where you have
new File(absolutePathToIndexJSP)
you should instantiate it, call its delete method, and then use it.
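The delete-then-recreate idea can be sketched with the JDK alone. This is an illustration under assumptions, not the asker's code: FreshFile, deleteOld and writeFresh are hypothetical names, and the byte array stands in for the bytes a workbook would write. Note that the question's code also reads the existing file into the workbook, so once the old file is deleted first, the workbook has to be built from scratch or from a separate template rather than loaded from that path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FreshFile {
    private FreshFile() {}

    /** Deletes any previous file at the path; returns true if one was removed. */
    static boolean deleteOld(Path target) {
        try {
            return Files.deleteIfExists(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the old file, then writes the content to a brand-new file of the same name. */
    static void writeFresh(Path target, byte[] content) {
        deleteOld(target);
        try {
            // CREATE_NEW fails if the file still exists, so old content is never reused
            Files.write(target, content, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("download").resolve("filename.xlsx");
        writeFresh(target, new byte[] {1, 2, 3});
        writeFresh(target, new byte[] {4});
        System.out.println("size after second write: " + Files.size(target));
    }
}
```

As the answer points out, this is only safe when two requests never touch the same path at the same time; otherwise one request can delete the file another is still writing.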

XSSFWorkbook takes a lot of time to load an Excel (.xlsx) file

I am using the following code in my program:
FileInputStream fileFile = null;
try {
    fileFile = new FileInputStream(new File("D:\\work\\result\\n01jfvjnjn.xlsx"));
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
XSSFWorkbook workbookFile = null;
try {
    workbookFile = new XSSFWorkbook(fileFile);
} catch (IOException e) {
    e.printStackTrace();
}
XSSFSheet sheetFile = workbookFile.getSheet("Sheet1");
This xlsx file has 20 sheets and each sheet has 100 rows; it is roughly a 5 MB file. I just go to a specific sheet and print the value of the first row and first column, and it takes nearly 30 seconds.
Most of the time is spent on the XSSFWorkbook line. I allocated 3 GB of heap and also tried the code below, with no difference.
File file = new File("C:\\D\\Data Book.xlsx");
OPCPackage opcPackage = OPCPackage.open(file);
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);
Is there a better way to do this?
Here's a link to a number of classes that handle the big-xlsx problem in a few steps. You only need to take the String arrays you get from them and put them in a list of String arrays.
Link: http://lchenaction.blogspot.nl/2013/12/how-to-read-super-large-excel-and-csv.html
