I am trying to read an Excel file, make some changes, and save the result to a new file.
I have created a small form with a button. On pressing the button:
It loads the Excel file and reads all the data into an ArrayList of a class I created.
It loops through the ArrayList and changes a few properties on each object.
It saves the data to a new Excel file.
Finally, it clears the ArrayList and shows a completion message box.
Now the problem is a memory issue.
When the form is loaded, I can see in Windows Task Manager that javaw is using around 23 MB.
During the Excel read and write, memory shoots up to around 170 MB.
After the ArrayList is cleared, the memory does not come back down and stays around 150 MB.
The following code is attached to the button's click event:
MouseListener mouseListener = new MouseAdapter() {
    public void mouseReleased(MouseEvent mouseEvent) {
        if (SwingUtilities.isLeftMouseButton(mouseEvent)) {
            ArrayList<Address> addresses = ExcelFunctions.getExcelData(fn);
            for (Address address : addresses) {
                address.setZestimate(Integer.toString(rnd.nextInt(45000)));
                address.setRedfinestimate(Integer.toString(rnd.nextInt(45000)));
            }
            ExcelFunctions.saveToExcel(ofn, addresses);
            addresses.clear();
            JOptionPane.showMessageDialog(null, "Done");
        }
    }
};
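As an aside (not the cause of the memory numbers, just the usual Swing idiom): a button is normally wired with an ActionListener rather than a MouseListener, and long-running file I/O is usually pushed off the Event Dispatch Thread, for example with a SwingWorker. A minimal sketch under those assumptions; the field name button is hypothetical, fn, ofn and rnd are the same fields used above:

// Hypothetical sketch: same work as above, triggered by an ActionListener
// and run on a background thread so the UI stays responsive.
button.addActionListener(e -> new SwingWorker<Void, Void>() {
    @Override
    protected Void doInBackground() throws Exception {
        ArrayList<Address> addresses = ExcelFunctions.getExcelData(fn);
        for (Address address : addresses) {
            address.setZestimate(Integer.toString(rnd.nextInt(45000)));
            address.setRedfinestimate(Integer.toString(rnd.nextInt(45000)));
        }
        ExcelFunctions.saveToExcel(ofn, addresses);
        return null;
    }
    @Override
    protected void done() {
        // runs back on the Event Dispatch Thread
        JOptionPane.showMessageDialog(null, "Done");
    }
}.execute());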
The code for reading and writing the Excel file is in this class:
public class ExcelFunctions {

    public static ArrayList<Address> getExcelData(String fn)
    {
        ArrayList<Address> output = new ArrayList<Address>();
        try
        {
            FileInputStream file = new FileInputStream(new File(fn));
            //Create Workbook instance holding reference to .xlsx file
            XSSFWorkbook workbook = new XSSFWorkbook(file);
            //Get first/desired sheet from the workbook
            XSSFSheet sheet = workbook.getSheetAt(0);
            System.out.println(sheet.getSheetName());
            //Iterate through each row one by one
            Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext())
            {
                Row row = rowIterator.next();
                int r = row.getRowNum();
                int fc = row.getFirstCellNum();
                int lc = row.getLastCellNum();
                String msg = "Row:" + r + "FColumn:" + fc + "LColumn" + lc;
                System.out.println(msg);
                if (row.getRowNum() > 0) {
                    Address add = new Address();
                    Cell c0 = row.getCell(0);
                    Cell c1 = row.getCell(1);
                    Cell c2 = row.getCell(2);
                    Cell c3 = row.getCell(3);
                    Cell c4 = row.getCell(4);
                    Cell c5 = row.getCell(5);
                    if (c0 != null) { c0.setCellType(Cell.CELL_TYPE_STRING); add.setState(c0.toString()); }
                    if (c1 != null) { c1.setCellType(Cell.CELL_TYPE_STRING); add.setCity(c1.toString()); }
                    if (c2 != null) { c2.setCellType(Cell.CELL_TYPE_STRING); add.setZipcode(c2.toString()); }
                    if (c3 != null) { c3.setCellType(Cell.CELL_TYPE_STRING); add.setAddress(c3.getStringCellValue()); }
                    if (c4 != null) { c4.setCellType(Cell.CELL_TYPE_STRING); add.setZestimate(c4.getStringCellValue()); }
                    if (c5 != null) { c5.setCellType(Cell.CELL_TYPE_STRING); add.setRedfinestimate(c5.getStringCellValue()); }
                    output.add(add);
                    c0 = null; c1 = null; c2 = null; c3 = null; c4 = null; c5 = null;
                }
            }
            workbook.close();
            file.close();
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
        return output;
    }
    public static void saveToExcel(String ofn, ArrayList<Address> addresses) {
        XSSFWorkbook workbook = new XSSFWorkbook();
        XSSFSheet sheet = workbook.createSheet("Addresses");
        Row header = sheet.createRow(0);
        header.createCell(0).setCellValue("State");
        header.createCell(1).setCellValue("City");
        header.createCell(2).setCellValue("Zip");
        header.createCell(3).setCellValue("Address");
        header.createCell(4).setCellValue("Zestimates");
        header.createCell(5).setCellValue("Redfin Estimate");
        int row = 1;
        for (Address address : addresses) {
            Row dataRow = sheet.createRow(row);
            dataRow.createCell(0).setCellValue(address.getState());
            dataRow.createCell(1).setCellValue(address.getCity());
            dataRow.createCell(2).setCellValue(address.getZipcode());
            dataRow.createCell(3).setCellValue(address.getAddress());
            dataRow.createCell(4).setCellValue(address.getZestimate());
            dataRow.createCell(5).setCellValue(address.getRedfinestimate());
            row++;
        }
        try {
            FileOutputStream out = new FileOutputStream(new File(ofn));
            workbook.write(out);
            out.close();
            workbook.close();
            System.out.println("Excel file written successfully");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
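For completeness, since the workbook and stream above are only closed when no exception is thrown, here is the read path wrapped in try-with-resources (Java 7+), which closes them either way. This is just a sketch of the same logic, not my exact code; the cell-to-property mapping is elided:

public static ArrayList<Address> getExcelData(String fn) {
    ArrayList<Address> output = new ArrayList<Address>();
    // try-with-resources closes the stream and workbook even if an exception is thrown
    try (FileInputStream file = new FileInputStream(new File(fn));
         XSSFWorkbook workbook = new XSSFWorkbook(file)) {
        XSSFSheet sheet = workbook.getSheetAt(0);
        for (Row row : sheet) {
            if (row.getRowNum() == 0) continue; // skip header row
            Address add = new Address();
            // ... same cell-to-property mapping as above ...
            output.add(add);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return output;
}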
I am unable to figure out where the issue is.
I'm closing the workbook, input stream, and output stream, and clearing the ArrayList too.
You probably don't have a memory leak...
When form is loaded, I can see in windows task manager...javaw is
using around 23MB. During read and write excel...memory shoots upto
170MB. After array list is cleared....Memory is not clearing up and
stays around 150MB.
This doesn't describe a memory leak - Task Manager is showing you the memory reserved by the process - not the application heap space.
Your JVM will allocate heap up to its configured maximum, say 200 MiB. Generally, after this memory is allocated from the OS, the JVM doesn't give it back (very often). However, if you look at your heap usage (with a tool like JConsole or VisualVM) you'll see that the heap is reclaimed after a GC.
How Java consumes memory
As a very basic example:
Image source: https://stopcoding.files.wordpress.com/2010/04/visualvm_hfcd4.png
In this example, the JVM has a 1 GiB max heap, and as the application needed more memory, 400 MiB was reserved from the OS (the orange area).
The blue area is the actual heap memory used by the application. The saw-tooth effect is the result of the garbage collection process reclaiming unused memory. Note that the orange area remains fairly static - it generally won't resize with each GC event...
within few seconds...it shoot upto 800MB and stays there till end....I
have not got any memory error
If you had a memory leak, you'd eventually get an out of memory error. A "leak" (in Java at least) is when the application ties up memory in the heap, but doesn't release it for reuse by the application. If your observed memory shoots up that quickly, but the application doesn't fall over, you'll probably see that internally (in the JVM) memory is actually being released and reused.
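If you want to check this without attaching a profiler, you can print the JVM's own heap counters from inside the application: used heap should drop back after a GC even though the Task Manager figure stays high. A rough sketch:

// Rough heap snapshot using the JVM's own counters (values are approximate)
Runtime rt = Runtime.getRuntime();
long usedMb  = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
long totalMb = rt.totalMemory() / (1024 * 1024); // heap currently reserved from the OS
long maxMb   = rt.maxMemory() / (1024 * 1024);   // the -Xmx ceiling
System.out.println("Heap used: " + usedMb + " MB of " + totalMb + " MB (max " + maxMb + " MB)");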
Limiting how much (OS) memory Java can use
If you want to limit the memory your application can reserve from the OS, you need to configure your maximum heap size (via the -Xmx option) as well as your permanent generation size (if you're still using Java 7 or earlier). Note that the JVM uses some memory itself, so the value shown at OS level (using tools like Task Manager) can be higher than the sum of application memory you have specified.
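For example (the jar name and the sizes here are placeholders, tune them to your workload):

java -Xmx256m -jar yourapp.jar
java -Xmx256m -XX:MaxPermSize=128m -jar yourapp.jar   (Java 7 or earlier, to also cap the permanent generation)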
Related
I have a list of strings read from MongoDB (~200k lines).
Then I want to write it to an Excel file with Java code:
public class OutputToExcelUtils {

    private static XSSFWorkbook workbook;
    private static final String DATA_SEPARATOR = "!";

    public static void clusterOutToExcel(List<String> data, String outputPath) {
        workbook = new XSSFWorkbook();
        FileOutputStream outputStream = null;
        writeData(data, "Data");
        try {
            outputStream = new FileOutputStream(outputPath);
            workbook.write(outputStream);
            workbook.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void writeData(List<String> data, String sheetName) {
        int rowNum = 0;
        XSSFSheet sheet = workbook.getSheet(sheetName);
        sheet = workbook.createSheet(sheetName);
        for (int i = 0; i < data.size(); i++) {
            System.out.println(sheetName + " Processing line: " + i);
            int colNum = 0;
            // Split into value of cell
            String[] valuesOfLine = data.get(i).split(DATA_SEPARATOR);
            Row row = sheet.createRow(rowNum++);
            for (String valueOfCell : valuesOfLine) {
                Cell cell = row.createCell(colNum++);
                cell.setCellValue(valueOfCell);
            }
        }
    }
}
Then I get an error:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.xmlbeans.impl.store.Cur$Locations.<init>(Cur.java:497)
    at org.apache.xmlbeans.impl.store.Locale.<init>(Locale.java:168)
    at org.apache.xmlbeans.impl.store.Locale.getLocale(Locale.java:242)
    at org.apache.xmlbeans.impl.store.Locale.newInstance(Locale.java:593)
    at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.newInstance(SchemaTypeLoaderBase.java:198)
    at org.apache.poi.POIXMLTypeLoader.newInstance(POIXMLTypeLoader.java:132)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRst$Factory.newInstance(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFRichTextString.<init>(XSSFRichTextString.java:87)
    at org.apache.poi.xssf.usermodel.XSSFCell.setCellValue(XSSFCell.java:417)
    at ups.mongo.excelutil.OutputToExcelUtils.writeData(OutputToExcelUtils.java:80)
    at ups.mongo.excelutil.OutputToExcelUtils.clusterOutToExcel(OutputToExcelUtils.java:30)
    at ups.mongodb.App.main(App.java:74)
Please give me some advice on this.
Thank you.
Update (solution): use SXSSFWorkbook instead of XSSFWorkbook.
public class OutputToExcelUtils {

    private static SXSSFWorkbook workbook;
    private static final String DATA_SEPERATOR = "!";

    public static void clusterOutToExcel(ClusterOutput clusterObject, ClusterOutputTrade clusterOutputTrade,
            ClusterOutputDistance ClusterOutputDistance, String outputPath) {
        workbook = new SXSSFWorkbook();
        workbook.setCompressTempFiles(true);
        FileOutputStream outputStream = null;
        writeData(clusterOutputTrade.getTrades(), "Data");
        try {
            outputStream = new FileOutputStream(outputPath);
            workbook.write(outputStream);
            workbook.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void writeData(List<String> data, String sheetName) {
        int rowNum = 0;
        SXSSFSheet sheet = workbook.createSheet(sheetName);
        sheet.setRandomAccessWindowSize(100); // keep only 100 rows in memory; older rows are flushed to disk
        for (int i = 0; i < data.size(); i++) {
            System.out.println(sheetName + " Processing line: " + i);
            int colNum = 0;
            // Split into value of cell
            String[] valuesOfLine = data.get(i).split(DATA_SEPERATOR);
            Row row = sheet.createRow(rowNum++);
            for (String valueOfCell : valuesOfLine) {
                Cell cell = row.createCell(colNum++);
                cell.setCellValue(valueOfCell);
            }
        }
    }
}
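One thing to watch with the streaming version: SXSSF spills rows to temporary files on disk, and those files are only removed when dispose() is called on the workbook. A sketch of the write/cleanup, reusing the outputStream and workbook fields above:

try {
    outputStream = new FileOutputStream(outputPath);
    workbook.write(outputStream);
} catch (IOException e) {
    e.printStackTrace();
} finally {
    if (outputStream != null) {
        try { outputStream.close(); } catch (IOException ignored) {}
    }
    workbook.dispose(); // delete the temporary files SXSSF wrote to disk
}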
Your application is spending too much time doing garbage collection. This doesn't necessarily mean that it is running out of heap space; however, it spends too much time in GC relative to performing actual work, so the Java runtime shuts it down.
Try to enable throughput collection with the following JVM option:
-XX:+UseParallelGC
While you're at it, give your application as much heap space as possible:
-Xmx????m
(where ???? stands for the maximum heap size in MB, e.g. -Xmx8192m)
If this doesn't help, try to set a more lenient throughput goal with this option:
-XX:GCTimeRatio=19
This specifies that your application should do 19 times more useful work than GC-related work, i.e. it allows the GC to consume up to 5% of the processor time (I believe the stricter 1% default goal may be causing the above runtime error)
No guarantee that this will work. Can you check and post back so others who experience similar problems may benefit?
EDIT
Your root problem remains the fact that you need to hold the entire spreadsheet and all its related objects in memory while you are building it. Another solution would be to serialize the data, i.e. write the actual spreadsheet file instead of constructing it in memory and saving it at the end. However, this requires reading up on the XLSX format and creating a custom solution.
Another option would be looking for a less memory-intensive library (if one exists). Possible alternatives to POI are JExcelAPI (open source) and Aspose.Cells (commercial).
I've used JExcelAPI years ago and had a positive experience (however, it appears that it is much less actively maintained than POI, so may no longer be the best choice).
EDIT 2
Looks like POI offers a streaming model (https://poi.apache.org/spreadsheet/how-to.html#sxssf), so this may be the best overall approach.
Well, try not to load all the data into memory. Even if the binary representation of 200k lines is not that big, the hydrated objects in memory may be too big. Just as a hint: if you have a POJO, each attribute of that POJO is a pointer, and each pointer, depending on whether compressed pointers are enabled, takes 4 or 8 bytes. This means that if your data is a POJO with 4 attributes, then for the pointers alone you will be spending 200,000 * 4 bytes (or 8 bytes) per attribute.
Theoretically you can increase the amount of memory for the JVM, but this is not a good solution, or more precisely it is not a good solution for a live system. For a non-interactive system it might be fine.
Hint: use the -Xmx and -Xms JVM arguments to control the heap size.
Instead of getting the entire list from the data, iterate line by line.
If that is too cumbersome, write the list to a file and re-read it line-wise, for instance as a Stream<String>:
Path path = Files.createTempFile(...);
Files.write(path, list, StandardCharsets.UTF_8);
Files.lines(path, StandardCharsets.UTF_8)
    .forEach(line -> { ... });
On the Excel side: although xlsx uses shared strings, in case XSSF handles this carelessly,
the following would make repeated string values share a single String instance.
public class StringCache {
    private static final int MAX_LENGTH = 40;
    private Map<String, String> identityMap = new HashMap<>();

    public String cached(String s) {
        if (s == null) {
            return null;
        }
        if (s.length() > MAX_LENGTH) {
            return s;
        }
        String t = identityMap.get(s);
        if (t == null) {
            t = s;
            identityMap.put(t, t);
        }
        return t;
    }
}
StringCache strings = new StringCache();

for (String valueOfCell : valuesOfLine) {
    Cell cell = row.createCell(colNum++);
    cell.setCellValue(strings.cached(valueOfCell));
}
So I'm making a large-scale prime number generator in Java (with the help of JavaFX).
It uses the Apache POI library (I believe I'm using v3.17) to output the results to Excel spreadsheets.
The static methods for this exporting logic are held in a class called ExcelWriter. Basically, it iterates through an ArrayList argument and populates an XSSFWorkbook with its contents. Afterwards, a FileOutputStream is used to actually write the Excel file. Here are the relevant parts of it:
public class ExcelWriter {
    //Configured JFileChooser to make alert before overwriting old files
    private static JFileChooser fileManager = new JFileChooser() {
        @Override
        public void approveSelection() {
            ...
        }
    };

    private static FileFilter filter = new FileNameExtensionFilter("Excel files", "xlsx");
    private static boolean hasBeenInitialized = false;

    //Only method that can be called externally to access this class's functionality
    public static <T extends Object> void makeSpreadsheet
            (ArrayList<T> list, spreadsheetTypes type, int max, String title, JFXProgressBar progressBar)
            throws IOException, InterruptedException {
        progressBar.progressProperty().setValue(0);
        switch (type) {
            case rightToLeftColumnLimit:
                makeSpreadsheetRightToLeft(list, false, max, title, progressBar);
                break;
            ...
        }
    }

    static private <T extends Object> void makeSpreadsheetRightToLeft
            (ArrayList<T> list, boolean maxRows, int max, String title, JFXProgressBar progressBar)
            throws IOException, InterruptedException {
        initializeChooser();
        XSSFWorkbook workbook = new XSSFWorkbook();
        XSSFSheet sheet = workbook.createSheet("Primus output");
        int rowPointer = 0;
        int columnPointer = 0;
        double progressIncrementValue = 1 / (double) list.size();

        //Giving the spreadsheet an internal title also
        Row row = sheet.createRow(0);
        row.createCell(0).setCellValue(title);
        row = sheet.createRow(++rowPointer);

        //Making the sheet with a max column limit
        if (!maxRows) {
            for (T number : list) {
                if (columnPointer == max) {
                    columnPointer = 0;
                    row = sheet.createRow(++rowPointer);
                }
                Cell cell = row.createCell(columnPointer++);
                progressBar.setProgress(progressBar.getProgress() + progressIncrementValue);
                cell.setCellValue(number.toString());
            }
        } else {
            //Making the sheet with a max row limit
            int columnWrapIndex = (int) Math.ceil(list.size() / (float) max);
            for (T number : list) {
                if (columnPointer == columnWrapIndex) {
                    columnPointer = 0;
                    row = sheet.createRow(++rowPointer);
                }
                Cell cell = row.createCell(columnPointer++);
                progressBar.setProgress(progressBar.getProgress() + progressIncrementValue);
                cell.setCellValue(number.toString());
            }
        }
        writeToExcel(workbook, progressBar);
    }

    static private void writeToExcel(XSSFWorkbook book, JFXProgressBar progressBar) throws IOException, InterruptedException {
        //Exporting to Excel
        int returnValue = fileManager.showSaveDialog(null);
        if (returnValue == JFileChooser.APPROVE_OPTION) {
            File file = fileManager.getSelectedFile();
            //Validation logic here
            try {
                FileOutputStream out = new FileOutputStream(file);
                book.write(out);
                out.close();
                book.close();
            } catch (FileNotFoundException ex) {
            }
        }
    }
}
Afterwards, my FXML document controller has a button listener which calls:
longCalculationThread thread = new longCalculationThread(threadBundle);
thread.start();
The longCalculationThread creates a list of about a million prime numbers and exports them via the ExcelWriter using this code:
private void publishResults() throws IOException, InterruptedException {
    if (!longResults.isEmpty()) {
        if (shouldExport) {
            progressText.setText("Exporting to Excel...");
            ExcelWriter.makeSpreadsheet(longResults, exportType, excelExportLimit, getTitle(), progressBar);
        }
    }
}
The problem is that even though the variable holding the XSSFWorkbook is local to the methods it is used in, it doesn't get garbage collected afterwards.
It takes up around 1.5 GB of RAM (I don't know why), and that memory is only reused when another huge export is run (not for small exports).
My problem isn't really that the export takes a lot of RAM; it's that even after the methods have completed, the memory isn't GCed.
Here are some pictures of my NetBeans profiles:
Normal memory usage when making array of 1000000 primes:
Huge heap usage when making workbook
Memory isn't reallocated when the workbook isn't accessible anymore
Fluctuation seen when making a new workbook using the same static methods
I found the answer! I had to prompt the GC with System.gc(). I remember trying this out earlier; however, I must have put it in a place where the workbook was still accessible and hence couldn't be GCed.
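Roughly, the idea is to drop the last reference to the workbook before hinting the GC. A sketch (not the exact code), and note that System.gc() is only a request to the JVM, not a command:

writeToExcel(workbook, progressBar);
workbook = null;   // drop the last reachable reference so the heap can actually be reclaimed
System.gc();       // hint only; the JVM may choose to ignore it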
I have developed my own Indexer in Lucene 5.2.1. I am trying to index a file of dimension of 1.5 GB and I need to do some non-trivial calculation during indexing time on every single document of the collection.
The problem is that it takes almost 20 minutes to do all the indexing! I have followed this very helpful wiki, but it is still way too slow. I have tried increasing the Eclipse heap space and the Java VM memory, but it seems to be more a matter of hard disk than virtual memory (I am using a laptop with 6 GB of RAM and a regular hard disk).
I have read this discussion that suggests using a RAMDirectory or mounting a RAM disk. The problem with a RAM disk would be persisting the index in my filesystem (I don't want to lose the index after a reboot). The problem with RAMDirectory instead is that, according to the APIs, I should not use it because my index is more than "several hundred megabytes"...
Warning: This class is not intended to work with huge indexes. Everything beyond several hundred megabytes will waste resources (GC cycles), because it uses an internal buffer size of 1024 bytes, producing millions of byte[1024] arrays. This class is optimized for small memory-resident indexes. It also has bad concurrency on multithreaded environments.
Here you can find my code:
public class ReviewIndexer {

    private JSONParser parser;
    private PerFieldAnalyzerWrapper reviewAnalyzer;
    private IndexWriterConfig iwConfig;
    private IndexWriter indexWriter;

    public ReviewIndexer() throws IOException {
        parser = new JSONParser();
        reviewAnalyzer = new ReviewWrapper().getPFAWrapper();
        iwConfig = new IndexWriterConfig(reviewAnalyzer);
        //change ram buffer size to speed things up
        //see https://wiki.apache.org/lucene-java/ImproveIndexingSpeed
        iwConfig.setRAMBufferSizeMB(2048);
        //little speed increase
        iwConfig.setUseCompoundFile(false);
        //iwConfig.setMaxThreadStates(24);
        // Set to overwrite the existing index
        indexWriter = new IndexWriter(FileUtils.openDirectory("review_index"), iwConfig);
    }

    /**
     * Indexes every review.
     * @param file_path : the path of the yelp_academic_dataset_review.json file
     * @throws IOException
     * @return Returns true if everything goes fine.
     */
    public boolean indexReviews(String file_path) throws IOException {
        BufferedReader br;
        try {
            //open the file
            br = new BufferedReader(new FileReader(file_path));
            String line;
            //define fields
            StringField type = new StringField("type", "", Store.YES);
            String reviewtext = "";
            TextField text = new TextField("text", "", Store.YES);
            StringField business_id = new StringField("business_id", "", Store.YES);
            StringField user_id = new StringField("user_id", "", Store.YES);
            LongField stars = new LongField("stars", 0, LanguageUtils.LONG_FIELD_TYPE_STORED_SORTED);
            LongField date = new LongField("date", 0, LanguageUtils.LONG_FIELD_TYPE_STORED_SORTED);
            StringField votes = new StringField("votes", "", Store.YES);
            Date reviewDate;
            JSONObject jsonVotes;
            try {
                indexWriter.deleteAll();
                //scan the file line by line
                //TO-DO: split in chunks and use parallel computation
                while ((line = br.readLine()) != null) {
                    try {
                        JSONObject jsonline = (JSONObject) parser.parse(line);
                        Document review = new Document();
                        //add values to fields
                        type.setStringValue((String) jsonline.get("type"));
                        business_id.setStringValue((String) jsonline.get("business_id"));
                        user_id.setStringValue((String) jsonline.get("user_id"));
                        stars.setLongValue((long) jsonline.get("stars"));
                        reviewtext = (String) jsonline.get("text");
                        //non-trivial function being calculated here
                        text.setStringValue(reviewtext);
                        reviewDate = DateTools.stringToDate((String) jsonline.get("date"));
                        date.setLongValue(reviewDate.getTime());
                        jsonVotes = (JSONObject) jsonline.get("votes");
                        votes.setStringValue(jsonVotes.toJSONString());
                        //add fields to document
                        review.add(type);
                        review.add(business_id);
                        review.add(user_id);
                        review.add(stars);
                        review.add(text);
                        review.add(date);
                        review.add(votes);
                        //write the document to the index
                        indexWriter.addDocument(review);
                    } catch (ParseException | java.text.ParseException e) {
                        e.printStackTrace();
                        br.close();
                        return false;
                    }
                } //end of while
            } catch (IOException e) {
                e.printStackTrace();
                br.close();
                return false;
            }
            //close buffered reader and commit changes
            br.close();
            indexWriter.commit();
        } catch (FileNotFoundException e1) {
            e1.printStackTrace();
            return false;
        }
        System.out.println("Done.");
        return true;
    }

    public void close() throws IOException {
        indexWriter.close();
    }
}
What is the best thing to do then? Should I build a RAM disk and then copy the index to the filesystem once it is done, should I use RAMDirectory anyway, or maybe something else? Many thanks.
Lucene claims 150GB/hour on modern hardware - that is with 20 indexing threads on a 24 core machine.
You have 1 thread, so expect about 150/20 = 7.5 GB/hour. You will probably see that 1 core is working 100% and the rest is only working when merging segments.
You should use multiple index threads to speeds things up. See for example the luceneutil Indexer.java for inspiration.
As you have a laptop I suspect you have either 4 or 8 cores, so multi-threading should be able to give your indexing a nice boost.
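IndexWriter is safe to share between threads, so one simple way to put those extra cores to work is to parse and add documents from a small thread pool. A rough sketch; buildDocument(line) is a placeholder for the JSON-to-Document code in the question, and each worker must build its own Field objects rather than reuse the shared ones declared above:

// Rough sketch: share the IndexWriter across a small pool of worker threads.
ExecutorService pool = Executors.newFixedThreadPool(4);
String line;
while ((line = br.readLine()) != null) {
    final String json = line;
    pool.submit(() -> {
        try {
            // buildDocument must create its own Field instances per call (thread safety)
            indexWriter.addDocument(buildDocument(json));
        } catch (Exception e) {
            e.printStackTrace();
        }
    });
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.HOURS);
indexWriter.commit();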
You can try setMaxThreadStates in IndexWriterConfig:
iwConfig.setMaxThreadStates(50);
I have an app that needs to access a large number of images very quickly, so I need to load those images into memory in some way. Doing so as bitmaps used over 100MB of RAM, which was completely out of the question, so I opted to read jpg files into memory, storing them inside a byteArray. Then I decode them and write them to the canvas as each is needed. This works pretty well, cutting out the slow disk access, while also respecting memory limits.
However, memory usage seems 'off' to me. I'm storing 450 jpgs with a file size of approximately 33kb each. This totals around 15MB of data. However, the app continually runs at between 35MB and 40MB of RAM as reported by both Eclipse DDMS and Android (on a physical device). I've tried modifying how many jpgs are loaded and the RAM used by the app tends to decrease by around 60-70kb per jpg, indicating that each image is stored twice in RAM. Memory usage does not fluctuate which implies that there is not an actual 'leak' involved.
Here is the relevant loading code:
private byte[][] bitmapArray = new byte[totalFrames][];

for (int x = 0; x < totalFrames; x++) {
    File file = null;
    if (cWidth <= cHeight) {
        file = new File(directory + "/f" + x + ".jpg");
    } else {
        file = new File(directory + "/f" + x + "-land.jpg");
    }
    bitmapArray[x] = getBytesFromFile(file);
    imagesLoaded = x + 1;
}
public byte[] getBytesFromFile(File file) {
    byte[] bytes = null;
    try {
        InputStream is = new FileInputStream(file);
        long length = file.length();
        bytes = new byte[(int) length];
        int offset = 0;
        int numRead = 0;
        while (offset < bytes.length && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
            offset += numRead;
        }
        if (offset < bytes.length) {
            throw new IOException("Could not completely read file " + file.getName());
        }
        is.close();
    } catch (IOException e) {
        //TODO Write your catch method here
    }
    return bytes;
}
Eventually, they get written to screen like so:
SurfaceHolder holder = getSurfaceHolder();
Canvas c = null;
try {
    c = holder.lockCanvas();
    if (c != null) {
        int canvasWidth = c.getWidth();
        int canvasHeight = c.getHeight();
        Rect destinationRect = new Rect();
        destinationRect.set(0, 0, canvasWidth, canvasHeight);
        c.drawBitmap(BitmapFactory.decodeByteArray(bitmapArray[bgcycle], 0, bitmapArray[bgcycle].length), null, destinationRect, null);
    }
} finally {
    if (c != null)
        holder.unlockCanvasAndPost(c);
}
Am I correct that there is some sort of duplication going on here? Or is there just that much overhead involved in storing jpgs in a byteArray like this?
Storing bytes in RAM is very different from storing data on hard drives... there is a lot more overhead to it. The references to the objects as well as the byte array structures all take up additional memory. There isn't really a single source of all the additional memory, but just remember that loading a file into RAM normally takes up 2-3x more space (from experience; I'm afraid I can't quote any documentation here).
Consider this:
File F = //Some file here (Less than 2 GB please)
FileInputStream fIn = new FileInputStream(F);
ByteArrayOutputStream bOut = new ByteArrayOutputStream(((int) F.length()) + 1);
int r;
byte[] buf = new byte[32 * 1000];

while ((r = fIn.read(buf)) != -1) {
    bOut.write(buf, 0, r);
}

//Do a memory measurement at this point. You'll see you're using nearly 3x the memory in RAM compared to the file.
//If you're actually going to try this, remember to surround with try-catch and close the streams as appropriate.
Also remember that unused memory is not instantly cleared up. The method getBytesFromFile() may be returning a copy of a byte array, which causes memory duplication that may not be garbage collected immediately. If you want to be safe, check that getBytesFromFile(file) is not leaking any references that should be cleaned up. It won't appear as a memory leak, as you only call it a finite number of times.
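Incidentally, if you are on a Java 7+ runtime (or Android API 26+), the manual read loop in getBytesFromFile can be replaced with a call that allocates exactly one array of the file's length; a sketch:

// Sketch: read the whole file into a single byte[] (java.nio.file, Java 7+ / Android API 26+)
byte[] bytes = java.nio.file.Files.readAllBytes(file.toPath());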
It might be because your byte array is two-dimensional. You only need one dimension for loading an image into a byte array, and the second dimension could potentially double the RAM needed, as for each byte you would have an empty but still existing byte that you don't use.
I am getting a Java heap space error while writing a large amount of data from a database to an Excel sheet.
I don't want to use the JVM -Xmx option to increase memory.
Following are the details:
1) I am using the org.apache.poi.hssf API for writing the Excel sheet.
2) JDK version 1.5
3) Tomcat 6.0
The code I have written works well for around 23 thousand records, but it fails for more than 23K records.
Following is the code:
ArrayList l_objAllTBMList = new ArrayList();
l_objAllTBMList = (ArrayList) m_objFreqCvrgDAO.fetchAllTBMUsers(p_strUserTerritoryId);
ArrayList l_objDocList = new ArrayList();
m_objTotalDocDtlsInDVL = new HashMap();
Object l_objTBMRecord[] = null;
Object l_objVstdDocRecord[] = null;
int l_intDocLstSize = 0;
VisitedDoctorsVO l_objVisitedDoctorsVO = null;
int l_tbmListSize = l_objAllTBMList.size();
System.out.println(" getMissedDocDtlsList_NSM ");

for (int i = 0; i < l_tbmListSize; i++)
{
    l_objTBMRecord = (Object[]) l_objAllTBMList.get(i);
    l_objDocList = (ArrayList) m_objGenerateVisitdDocsReportDAO.fetchAllDocDtlsInDVL_NSM((String) l_objTBMRecord[1], p_divCode, (String) l_objTBMRecord[2], p_startDt, p_endDt, p_planType, p_LMSValue, p_CycleId, p_finYrId);
    l_intDocLstSize = l_objDocList.size();
    try {
        l_objVOFactoryForDoctors = new VOFactory(l_intDocLstSize, VisitedDoctorsVO.class);
        /* Factory class written to create and maintain a limited number of Value Objects (VOs) */
    } catch (ClassNotFoundException ex) {
        m_objLogger.debug("DEBUG:getMissedDocDtlsList_NSM :Exception:" + ex);
    } catch (InstantiationException ex) {
        m_objLogger.debug("DEBUG:getMissedDocDtlsList_NSM :Exception:" + ex);
    } catch (IllegalAccessException ex) {
        m_objLogger.debug("DEBUG:getMissedDocDtlsList_NSM :Exception:" + ex);
    }

    for (int j = 0; j < l_intDocLstSize; j++)
    {
        l_objVstdDocRecord = (Object[]) l_objDocList.get(j);
        l_objVisitedDoctorsVO = (VisitedDoctorsVO) l_objVOFactoryForDoctors.getVo();
        if (((String) l_objVstdDocRecord[6]).equalsIgnoreCase("-"))
        {
            if (String.valueOf(l_objVstdDocRecord[2]) != "null")
            {
                l_objVisitedDoctorsVO.setPotential_score(String.valueOf(l_objVstdDocRecord[2]));
                l_objVisitedDoctorsVO.setEmpcode((String) l_objTBMRecord[1]);
                l_objVisitedDoctorsVO.setEmpname((String) l_objTBMRecord[0]);
                l_objVisitedDoctorsVO.setDoctorid((String) l_objVstdDocRecord[1]);
                l_objVisitedDoctorsVO.setDr_name((String) l_objVstdDocRecord[4] + " " + (String) l_objVstdDocRecord[5]);
                l_objVisitedDoctorsVO.setDoctor_potential((String) l_objVstdDocRecord[3]);
                l_objVisitedDoctorsVO.setSpeciality((String) l_objVstdDocRecord[7]);
                l_objVisitedDoctorsVO.setActualpractice((String) l_objVstdDocRecord[8]);
                l_objVisitedDoctorsVO.setLastmet("-");
                l_objVisitedDoctorsVO.setPreviousmet("-");
                m_objTotalDocDtlsInDVL.put((String) l_objVstdDocRecord[1], l_objVisitedDoctorsVO);
            }
        }
    } // End of inner for loop
    writeExcelSheet(); // Pasting this method at the end

    // Clean up code
    l_objVOFactoryForDoctors.resetFactory();
    m_objTotalDocDtlsInDVL.clear(); // Clear the used map
    l_objDocList = null;
    l_objTBMRecord = null;
    l_objVstdDocRecord = null;
} // End of outer for loop
l_objAllTBMList = null;
m_objTotalDocDtlsInDVL = null;
-------------------------------------------------------------------
private void writeExcelSheet() throws IOException
{
    HSSFRow l_objRow = null;
    HSSFCell l_objCell = null;
    VisitedDoctorsVO l_objVisitedDoctorsVO = null;
    Iterator l_itrDocMap = m_objTotalDocDtlsInDVL.keySet().iterator();
    while (l_itrDocMap.hasNext())
    {
        Object key = l_itrDocMap.next();
        l_objVisitedDoctorsVO = (VisitedDoctorsVO) m_objTotalDocDtlsInDVL.get(key);

        l_objRow = m_objSheet.createRow(m_iRowCount++);
        l_objCell = l_objRow.createCell(0);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(String.valueOf(l_intSrNo++));

        l_objCell = l_objRow.createCell(1);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getEmpname() + " (" + l_objVisitedDoctorsVO.getEmpcode() + ")"); // TBM Name

        l_objCell = l_objRow.createCell(2);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getDr_name()); // Doc Name

        l_objCell = l_objRow.createCell(3);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getPotential_score()); // Freq potential score

        l_objCell = l_objRow.createCell(4);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getDoctor_potential()); // Freq potential score

        l_objCell = l_objRow.createCell(5);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getSpeciality()); // CP_GP_SPL

        l_objCell = l_objRow.createCell(6);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getActualpractice()); // Actual practise

        l_objCell = l_objRow.createCell(7);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getPreviousmet()); // Lastmet

        l_objCell = l_objRow.createCell(8);
        l_objCell.setCellStyle(m_objCellStyle4);
        l_objCell.setCellValue(l_objVisitedDoctorsVO.getLastmet()); // Previousmet
    }

    // Write OutPut Stream
    try {
        out = new FileOutputStream(m_objFile);
        outBf = new BufferedOutputStream(out);
        m_objWorkBook.write(outBf);
    } catch (Exception ioe) {
        ioe.printStackTrace();
        System.out.println(" Exception in chunk write");
    } finally {
        if (outBf != null) {
            outBf.flush();
            outBf.close();
            out.close();
            l_objRow = null;
            l_objCell = null;
        }
    }
}
Instead of populating the complete list in memory before starting to write to excel you need to modify the code to work in such a way that each object is written to a file as it is read from the database. Take a look at this question to get some idea of the other approach.
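For example, a rough sketch of that shape, reusing the names from the question for illustration (cell styles omitted): create each HSSFRow straight from the record in the inner loop instead of buffering VOs in the map. Note that HSSF still keeps the whole workbook in memory until the final write, so chunking or CSV, as discussed below, may still be needed for very large exports.

// Sketch: write each record straight into the sheet instead of buffering it in a map
for (int j = 0; j < l_intDocLstSize; j++) {
    Object[] l_objVstdDocRecord = (Object[]) l_objDocList.get(j);
    HSSFRow l_objRow = m_objSheet.createRow(m_iRowCount++);
    l_objRow.createCell(0).setCellValue(String.valueOf(l_intSrNo++));
    l_objRow.createCell(1).setCellValue((String) l_objTBMRecord[0] + " (" + (String) l_objTBMRecord[1] + ")");
    // ... remaining cells as in writeExcelSheet() ...
}
// write the workbook once, after all chunks have been processed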
Well, I'm not sure if POI can handle incremental updates but if so you might want to write chunks of say 10000 Rows to the file. If not, you might have to use CSV instead (so no formatting) or increase memory.
The problem is that you need to make the objects written to the file eligible for garbage collection (no references from a live thread anymore) before writing the file is finished (before all rows have been generated and written to the file).
Edit:
If you can write smaller chunks of data to the file, you'd also have to load only the necessary chunks from the db. So it doesn't make sense to load 50000 records at once and then try to write 5 chunks of 10000, since those 50000 records are likely to consume a lot of memory already.
As Thomas points out, you have too many objects taking up too much space, and need a way to reduce that. There are a couple of strategies for this I can think of:
Do you need to create a new factory each time in the loop, or can you reuse it?
Can you start with a loop getting the information you need into a new structure, and then discarding the old one?
Can you split the processing into a thread chain, sending information forwards to the next step, avoiding building a large memory consuming structure at all?