I have implemented a Java Swing application with an Open File feature. I have tried a lot of ways to read the file and write it into the JTextArea (I have tried append(), setText(), and the read() method), but it only works up to about 100 MB. If I try to open a file larger than 100 MB, it raises an "OutOfMemoryError: Java heap space" at textarea.append(). Is there any way to append more than 100 MB of data to a JTextArea, or any way to increase the memory capacity available to it? Please give suggestions for this issue. Thank you.
Possibly a duplicate of Java using up far more memory than allocated with -Xmx, since your problem is really that your Java instance is running out of memory.
Java can theoretically open files of any size, as long as you have enough memory to hold what is read.
I would, however, recommend reading only part of the file into memory at a time. When you have finished with that part, you move on to the next chunk of text.
Anyhow, for this one case, and if this is not a regular problem, you could use -Xmx800m, which would let Java use 800 MB of heap space.
If this is not a one-time thing, you really should look into reading only parts of the file at a time. http://www.baeldung.com/java-read-lines-large-file should put you on the right track.
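As a rough sketch of that chunked approach (the class name, file name, and chunk size below are arbitrary placeholders, not taken from the original application):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ChunkedFileReader {
    public static void main(String[] args) throws IOException {
        char[] buffer = new char[64 * 1024]; // read 64 KB of characters at a time
        long totalChars = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader("big-file.txt"))) {
            int read;
            while ((read = reader.read(buffer)) != -1) {
                String chunk = new String(buffer, 0, read);
                // Process the chunk here (search it, or show only the part the user
                // needs to see) instead of accumulating every chunk in memory.
                totalChars += chunk.length();
            }
        }
        System.out.println("Read " + totalChars + " characters");
    }
}

Note that appending everything you read to a single JTextArea still keeps the whole file on the heap, so for very large files you would only want to display a window of the content at a time.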
Related
I am trying to use NiFi to process large CSV files (potentially billions of records each) using HDF 1.2. I've implemented my flow, and everything is working fine for small files.
The problem is that if I try to push the file size to 100MB (1M records), I get a java.lang.OutOfMemoryError: GC overhead limit exceeded from the SplitText processor responsible for splitting the file into single records. I've searched for that, and it basically means that the garbage collector runs for too long without reclaiming much heap space. I expect this means that too many flow files are being generated too fast.
How can I solve this? I've tried changing NiFi's configuration for the max heap space and other memory-related properties, but nothing seems to work.
Right now I have added an intermediate SplitText with a line count of 1K, which lets me avoid the error, but I don't see this as a solid solution for when the incoming files become much larger than that; I am afraid I will get the same behavior from the processor.
Any suggestion is welcome! Thank you.
The reason for the error is that when splitting 1M records with a line count of 1, you are creating 1M flow files, which equates to 1M Java objects. Overall, the approach of using two SplitText processors is common and avoids creating all of the objects at the same time. You could probably use an even larger split size on the first split, maybe 10k. For a billion records I wonder whether a third level would make sense: split from 1B to maybe 10M, then 10M to 10K, then 10K to 1, but I would have to play with it.
Some additional things to consider are increasing the default heap size from 512MB (which you may have already done), and figuring out whether you really need to split down to one line. It is hard to say without knowing anything else about the flow, but in a lot of cases, if you want to deliver each line somewhere, you could have a processor that reads in a large delimited file and streams each line to the destination. For example, this is how PutKafka and PutSplunk work: they can take a file with 1M lines and stream each line to the destination.
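This is behavior built into the NiFi processors rather than something you write yourself, but as a rough Java illustration of the streaming idea (the class name and the Consumer<String> callback are made up for this sketch), the pattern is to hand each line to a handler as it is read instead of materializing one object per record:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.Consumer;

public class LineStreamer {

    // Streams each line of the file to the handler; only one line is held in memory at a time.
    public static long stream(String path, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line); // deliver the record (e.g. publish it to Kafka or Splunk)
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        long delivered = stream("big.csv", line -> { /* send the line somewhere */ });
        System.out.println("Delivered " + delivered + " records");
    }
}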
I had a similar error while using the GetMongo processor in Apache NiFi.
I changed my configurations to:
Limit: 100
Batch Size: 10
Then the error disappeared.
I'm using Lucene 4.9.0 to index 23k files, but now I'm getting a java.lang.OutOfMemoryError: Java heap space message.
I don't want to increase the "heap size" because the number of files tends to increase every day.
How can I index all files without the OOM problem and increase "heap space"?
Your question is too vague and makes little sense.
First of all, 23K files can be 1 byte each or 1 GB each. How are we supposed to know what's inside and how heavyweight they are?
Secondly, you say
I don't want to increase "heap size" because <...>
and straight after you say
How can I index all files without the OOM problem and increase "heap space"
Can you make up your mind on whether you can increase heap space or not?
There's a certain amount of memory required to index the data, and there's not much you can do about it. That said, the most memory is required during the merge process, and you can play with the merge factor to see if that helps you.
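As a hedged sketch against the Lucene 4.9 API (the buffer size and merge settings below are arbitrary starting points to experiment with, not recommended values), the usual knobs are the indexing RAM buffer and the merge policy:

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexTuning {
    public static IndexWriter openWriter(String indexDir) throws Exception {
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_4_9, new StandardAnalyzer(Version.LUCENE_4_9));

        // Flush in-memory segments to disk once they reach ~32 MB instead of
        // letting the indexing buffer grow with the number of documents.
        config.setRAMBufferSizeMB(32.0);

        // Merging is where most of the memory goes; smaller merges need less of it.
        TieredMergePolicy mergePolicy = new TieredMergePolicy();
        mergePolicy.setMaxMergeAtOnce(5);
        mergePolicy.setSegmentsPerTier(5.0);
        config.setMergePolicy(mergePolicy);

        return new IndexWriter(FSDirectory.open(new File(indexDir)), config);
    }
}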
I am developing an in memory data structure, and would like to add persistency.
I am looking for a way to do this fast. I thought about dumping a heap dump once in a while.
Is there a way to load this java heap dump as is, into my memory? or is it impossible?
Otherwise, other suggestions for fast write and fast read of the entire information?
(Serialization might take a lot of time)
-----------------edited explanation:--------
Since my memory might be full of small pieces of information referencing each other, serialization may require me to inefficiently scan all of my memory. Reloading is also potentially problematic.
On the other hand, I can define a gigantic array, and each object I create I put into the array. Links are long numbers representing the position in the array. Now I can just dump this array as is, and also reload it as is.
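A minimal sketch of that idea (the class name and the two-slots-per-node layout are just assumptions for illustration): nodes live in one big long[] array, "pointers" are indices into it, and the whole array can be written out and read back as a single block.

import java.io.*;
import java.util.Arrays;

// A tiny index-linked node pool: each node occupies two slots,
// [value, nextIndex], and links are just array indices.
public class ArenaList {
    private long[] slots = new long[1024];
    private int top = 0;

    public int addNode(long value, int nextIndex) {
        if (top + 2 > slots.length) {
            slots = Arrays.copyOf(slots, slots.length * 2);
        }
        int index = top;
        slots[top++] = value;
        slots[top++] = nextIndex;
        return index;
    }

    public long valueAt(int index) { return slots[index]; }
    public int nextOf(int index)   { return (int) slots[index + 1]; }

    // Dump the arena as one contiguous block -- no per-object traversal needed.
    public void dump(File file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(top);
            for (int i = 0; i < top; i++) {
                out.writeLong(slots[i]);
            }
        }
    }

    // Reload the arena from the dumped block.
    public static ArenaList load(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            ArenaList arena = new ArenaList();
            arena.top = in.readInt();
            arena.slots = new long[Math.max(1024, arena.top)];
            for (int i = 0; i < arena.top; i++) {
                arena.slots[i] = in.readLong();
            }
            return arena;
        }
    }
}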
There are even some JVMs, like JRockit, that utilize disk space, so maybe it is possible to dump as is very quickly and to reload very quickly.
To support my point: a Java heap dump contains all the information of the JVM, and it is produced quickly.
Sorry, but serializing 4 GB isn't even close to the few seconds a dump takes.
Also, memory is memory, and there are operating systems that let you dump RAM quickly.
https://superuser.com/questions/164960/how-do-i-dump-physical-memory-in-linux
When you think about it, this is quite a good strategy for persistent data structures. There has been quite a hype about in-memory databases in the last decade. But why settle for that? What if I want a Fibonacci heap to be "almost persistent"? That is, every 5 minutes I dump the information (quickly), and in case of a power outage I have a backup from 5 minutes ago.
-----------------end of edited explanation:--------
Thank you.
In general, there is no way to do this on HotSpot.
Objects in the heap have 2 words of header, the second of which points into permgen for the class metadata (known as a klassOop). You would have to dump all of permgen as well, which includes all the pointers to compiled code - so basically the entire process.
There would be no sane way to recover the heap state correctly.
It may be better to explain precisely what you want to build & why already-existing products don't do what you need.
Use serialization. Implement java.io.Serializable, add a serialVersionUID to all of your classes, and you can persist them to any OutputStream (file, network, whatever). Just create a starting object from which all your objects are reachable (even indirectly).
I don't think serialization would take a long time; it's optimized code in the JVM.
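A minimal sketch of that approach, assuming a made-up Graph root object from which everything else you want to persist is reachable:

import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Hypothetical root object; everything you want persisted must be reachable
// from it, and each class involved must implement Serializable.
class Graph implements Serializable {
    private static final long serialVersionUID = 1L;
    List<String> nodes = new ArrayList<>();
}

public class SnapshotDemo {
    static void save(Graph graph, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeObject(graph); // walks the whole object graph for you
        }
    }

    static Graph load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            return (Graph) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Graph g = new Graph();
        g.nodes.add("a");
        save(g, new File("snapshot.bin"));
        Graph restored = load(new File("snapshot.bin"));
        System.out.println(restored.nodes);
    }
}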
You can use jhat or jvisualvm to load your dump and analyze it. I don't know whether the dump file can be loaded and the program restarted from it.
I'm running the following code more or less out of the box
http://download.oracle.com/javase/1.4.2/docs/guide/nio/example/Grep.java
I'm using the following VM arguments
-Xms756m -Xmx1024m
It crashes with OutOfMemory on a 400 MB file. What am I doing wrong?
Stack trace:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapCharBuffer.<init>(Unknown Source)
at java.nio.CharBuffer.allocate(Unknown Source)
at java.nio.charset.CharsetDecoder.decode(Unknown Source)
at com.alluvialtrading.tools.Importer.<init>(Importer.java:46)
at com.alluvialtrading.tools.ReutersImporter.<init>(ReutersImporter.java:24)
at com.alluvialtrading.tools.ReutersImporter.main(ReutersImporter.java:20)
You are not doing anything wrong.
The problem is that the application maps the entire file into memory, and then creates a second in-heap copy of the file. The mapped file is not consuming heap space, though it does use part of the JVM's virtual address space.
It is the second copy, and the process of creating it, that is actually filling the heap. The second copy contains the file content expanded into 16-bit characters. A contiguous array of ~400 million characters (800 million bytes) is too big for a 1 GB heap, considering how the heap spaces are partitioned.
In short, the application is simply using too much memory.
You could try increasing the maximum heap size, but the real problem is that the application is too simple-minded in the way it manages memory.
The other point to make is that the application you are running is an example designed to illustrate how to use NIO. It is not designed to be a general-purpose, production-quality utility. You need to adjust your expectations accordingly.
Probably because the 400 MB file is loaded into a CharBuffer, so it takes twice as much memory in UTF-16 encoding. That does not leave much memory for the pattern matcher.
If you're using a recent version of Java, try -XX:+UseCompressedStrings so that strings are represented internally as byte arrays and consume less memory. You might have to put the CharBuffer into a String.
So the exception is
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
at java.nio.CharBuffer.allocate(CharBuffer.java:329)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:777)
at Grep.grep(Grep.java:118)
at Grep.main(Grep.java:136)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
The line in question is the constructor of HeapCharBuffer:
super(-1, 0, lim, cap, new char[cap], 0);
which means it cannot create a char array the size of the file.
If you want to grep large files in Java, you'd need an approach that works from a Reader of some sort, along the lines of the sketch below. The standard Java library does not have such functionality.
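A minimal sketch of that idea (the class name is made up, and unlike the memory-mapped example it only finds matches within a single line):

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamingGrep {
    public static void main(String[] args) throws IOException {
        Pattern pattern = Pattern.compile(args[0]);
        // Read line by line so only one line is ever held on the heap,
        // regardless of how large the input file is.
        try (BufferedReader reader =
                Files.newBufferedReader(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    System.out.println(lineNumber + ": " + line);
                }
            }
        }
    }
}

Run it as, for example, java StreamingGrep "pattern" bigfile.txt.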
I would assume it is because the class as given loads the ENTIRE file into memory. Exactly where, I'm not sure, as I do not know the Java NIO classes, but I would suspect classes like MappedByteBuffer and CharBuffer might be the issue.
A stack trace might be able to tell you where it's coming from.
I've written a simple application that works with a database. My program has a table to show data from the database. When I try to expand the frame, the program fails with an OutOfMemory error, but if I don't try to do this, it works well.
I start my program with the -Xmx4m parameter. Does it really need more than 4 megabytes to be in the expanded state?
Another question: if I run Java VisualVM, I see a saw-toothed chart of my program's heap usage, while other programs using the Java VM (such as NetBeans) have much flatter charts. Why is the heap usage of my program so unstable even though it does nothing (it is only waiting for the user to push a button)?
You may want to try setting the following JVM option to generate a detailed heap dump that shows you exactly what is going on.
-XX:+HeapDumpOnOutOfMemoryError
A typical "small" Java desktop application in 2011 is going to run with ~64-128MB. Unless you have a really pressing need, I would start by leaving it set to the default (i.e. no setting).
If you are trying to do something different (e.g. run this on an Android device), you are going to need to get very comfortable with profiling (and you should probably post with that tag).
Keep in mind that your 100-record cache (~12 bytes) is probably double that if you are storing character data (Java uses UTF-16 internally).
Re: the "instability", the JVM is handling memory usage for you and will perform garbage collection according to whatever algorithms it chooses (these have changed dramatically over the years). The graphing may just be an artifact of the tool and the sample period. The performance of a desktop app is affected by a huge number of factors.
As an example, we once had a huge memory "leak" that only showed up in one automated test but never showed up in normal real world usage. Turned out the test left the mouse hovering over a tool tip which included the name of the open file, which in turn had a set of references back to the entire (huge) project. Wiggling the mouse a few pixels got rid of the tooltip, which meant that the references all cleared up and the garbage collector took out the trash.
Moral of the story? You need to capture the exact heap dump at time of the out-of-memory and review it very carefully.
Why would you set your maximum heap size to 4 megabytes? Java is often memory intensive, so setting it at such a ridiculously low level is a recipe for disaster.
It also depends on how many objects are created and destroyed by your code, and on how the underlying Swing (I am assuming) components use renderer components to draw the elements, and how those elements are created and destroyed each time a component is redrawn.
Look at the CellRenderer code and it will show you why objects are created and destroyed so often, and why the garbage collector does such a wonderful job.
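For illustration only (a generic sketch, not code from the question): a Swing table renderer is a single "rubber stamp" component that is reconfigured and repainted for every visible cell, which is why repaints churn through short-lived objects that the garbage collector quickly reclaims.

import java.awt.Component;
import javax.swing.JTable;
import javax.swing.table.DefaultTableCellRenderer;

// One renderer instance is reused ("rubber-stamped") for every cell it paints.
public class HighlightRenderer extends DefaultTableCellRenderer {
    @Override
    public Component getTableCellRendererComponent(JTable table, Object value,
            boolean isSelected, boolean hasFocus, int row, int column) {
        // Reconfigure the same component for each cell; the temporary objects
        // created while painting become garbage almost immediately.
        super.getTableCellRendererComponent(table, value, isSelected, hasFocus, row, column);
        setText(value == null ? "" : value.toString());
        return this;
    }
}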
Try playing with the -Xmx setting and see how the charts flatten out. I would expect -Xmx64m or -Xmx128m to be suitable (although the amount of data coming out of your database will obviously be an important contributing factor).
You may need more than 4 MB for a GUI with an expanded window if you are using double buffering. This generates multiple images of the UI so that they can be shown on screen quickly, and it is usually done on the assumption that you have lots and lots of memory.
The sawtooth memory pattern is due to something being done and then garbage collected, perhaps on a repaint operation or some other timer. Is there a timer in your code that checks some process or watches for a value change? Or have you added code to an object's repaint or some other recurring process?
I think 4 MB is too small for anything except a trivial program - for example, lots of GUI libraries (Swing included) need to allocate temporary working space for graphics, and that alone may exceed this amount.
If you want to avoid out of memory errors but also want to avoid over-allocating memory to the JVM, I'd recommend setting a large maximum heap size and a small initial heap size.
-Xmx (the maximum heap size) should generally be quite large, e.g. 256 MB.
-Xms (the initial heap size) can be much smaller; 4 MB should work, though remember that if the application needs more than this, there will be a temporary performance hit while the heap is resized.
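Concretely, that combination would look something like this on the command line (the jar name is just a placeholder):

java -Xms4m -Xmx256m -jar your-app.jar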