Android OutOfMemoryError - Loading JSON File - java

The app I am working on needs to read a JSON file that may be anywhere from 1.5 to 3 MB in size. It seems to have no problem opening the file and converting the data to a string, but when it attempts to convert the string to a JSONArray, OutOfMemoryErrors are thrown. The exceptions look something like this:
E/dalvikvm-heap( 5307): Out of memory on a 280-byte allocation.
W/dalvikvm( 5307): Exception thrown (Ljava/lang/OutOfMemoryError;) while throwing internal exception (Ljava/lang/OutOfMemoryError;)
One strange thing about this is that the crash only occurs every 2nd or 3rd time the app is run, leaving me to believe that the memory consumed by the app is not being garbage collected each time the app closes.
Any insight into how I might get around this issue would be greatly appreciated. I am open to the idea of loading the file in chunks, but I'm not quite sure what the best approach is for such a task.
Thank you

When you say "2nd or 3rd its run" do you mean each time your starting with a fresh emulator? or do you mean leaving the application and coming back? (for instance pressing home, or calling finalize() )
If you're referring to leaving the application and relaunching it:
If you haven't set android:launchMode in your manifest to define the activity as singleInstance or singleTask, then each time the application is launched a new activity is created and added to the activity stack. You could easily have multiple copies of your activity running in your application process, eating a lot of memory.
If it's happening on the 2nd launch, you're still using a lot of memory and should break the problem down further.

One strange thing about this is that the crash only occurs every 2nd or 3rd time the app is run, leaving me to believe that the memory consumed by the app is not being garbage collected each time the app closes.
That is certainly possible, and if it is the case then it is probably due to a memory leak that can be traced back to something your application is doing. I think you should focus your initial efforts on investigating this aspect ... rather than loading the file in chunks. (I am not familiar with the Android tool-chain, but I am sure it includes memory usage profilers or memory dump analysers.)
EDIT
In response to your followup comment, the fact that it works 2 times in 3 suggests that your app ought to work roughly as-is. Admittedly, you don't have much leeway if the input file gets bigger.
A couple of ideas though:
Rather than reading the file into a String and running the JSON parser on the String, use a parser that can read directly from a stream. Your current solution needs space for two complete copies of the data in memory while you are doing the parsing.
If the file gets much bigger, you may need to think of a design that doesn't create a complete in-memory representation of the data.
I'm not sure that it is a good idea to read a JSON file in "chunks". This could present problems for parsing the JSON ... depending on exactly what you mean by reading in chunks.
EDIT 2
Maybe what you need is a "SAX like" JSON parser; e.g. http://code.google.com/p/async-json-library/

Try parsing the JSON data efficiently on Android using JsonReader. It's like SAX parsing for XML.
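Here is a minimal sketch of that streaming approach, assuming the file is a JSON array of flat objects; the field names ("id", "name") and the Item class are placeholders for whatever your real data looks like:

import android.util.JsonReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class StreamingJsonLoader {

    // Placeholder model class; map it to whatever your JSON objects contain.
    public static class Item {
        int id;
        String name;
    }

    // Reads a JSON array of objects one entry at a time, so the whole file is
    // never held in memory as a single String.
    public static List<Item> load(String path) throws Exception {
        List<Item> items = new ArrayList<Item>();
        JsonReader reader = new JsonReader(
                new InputStreamReader(new FileInputStream(path), "UTF-8"));
        try {
            reader.beginArray();
            while (reader.hasNext()) {
                Item item = new Item();
                reader.beginObject();
                while (reader.hasNext()) {
                    String field = reader.nextName();
                    if ("id".equals(field)) {
                        item.id = reader.nextInt();
                    } else if ("name".equals(field)) {
                        item.name = reader.nextString();
                    } else {
                        reader.skipValue();   // ignore fields we don't care about
                    }
                }
                reader.endObject();
                items.add(item);
            }
            reader.endArray();
        } finally {
            reader.close();
        }
        return items;
    }
}

If the in-memory list itself gets too large, process each Item inside the loop instead of collecting them.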

Related

restart SAX parser from the middle of the document

I'm working on a project that needs to parse a very big XML file (about 10 GB). Because processing time is really long (days), it's possible that my code will exit in the middle of the process, so I want to save its status once in a while and then be able to restart from the last save point.
Is there a way to start (restart) a SAX parser not from the beginning of an XML file?
P.S: I'm programming using Python, but solutions for Java and C++ are also acceptable.
Not really sure if this answers your question, but I would take a different approach. 10 GB is not THAT much data, so you could implement two-phase parsing.
Phase 1 would be to split the file into smaller chunks based on some tag, so you end up with several smaller files. For example, if your first file is A.xml, you split it into A_0.xml, A_1.xml, etc.
Phase 2 would do the real heavy lifting on each chunk, so you invoke it on A_0.xml, then on A_1.xml, etc. You could then restart on a chunk after your code has exited.
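The sketch below shows one way phase 1 could look in Java, using StAX plus an identity Transformer to copy whole elements. The element names ("record", "records"), the input file name A.xml, and the chunk size are assumptions; adjust them to your document:

import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.Writer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

public class XmlSplitter {

    private static final int RECORDS_PER_CHUNK = 100000;   // tune for your data

    public static void main(String[] args) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream("A.xml"));
        Transformer copier = TransformerFactory.newInstance().newTransformer();
        copier.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        int chunk = 0;
        int inChunk = 0;
        Writer out = null;
        while (reader.hasNext()) {
            if (reader.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "record".equals(reader.getLocalName())) {
                if (out == null) {
                    out = new FileWriter("A_" + chunk++ + ".xml");
                    out.write("<records>");
                }
                // Copies this whole <record> subtree; the reader is left at or just past its end tag.
                copier.transform(new StAXSource(reader), new StreamResult(out));
                if (++inChunk == RECORDS_PER_CHUNK) {
                    out.write("</records>");
                    out.close();
                    out = null;
                    inChunk = 0;
                }
            } else {
                reader.next();
            }
        }
        if (out != null) {
            out.write("</records>");
            out.close();
        }
        reader.close();
    }
}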

Web services response conversion issue

I am using a number of web services in my systems and I handle them carefully. But in recent days I have been seeing strange behaviour while handling the response of one of my external systems.
Here is my problem:
When I request data from one of my downstream systems, I get a response containing one very big XML. While parsing the response, the Java thread itself gets stuck for longer than the configured time. So as a temporary fix, we asked the downstream system to limit the response.
But how is this happening? Irrespective of how big the data is, the unmarshalling process should complete, right?
So may I know what the root cause of this issue is?
If you are unmarshalling, then the whole XML will be converted to one object graph containing all the objects specified in the XML. So the bigger the XML, the bigger the resulting object graph. Of course this takes more memory, perhaps more than your application has at its disposal, which could lead to an OutOfMemoryError.
If the XML received contains some kind of a list of items you can consider handling it item by item. You will read in one item at a time and then process it and dispose of it. You will then need only the amount of memory to fit one item's object graph in memory. But to do this you would have to rewrite your processing code to use a library like SAX.
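A rough sketch of that item-by-item idea, combining StAX with JAXB; the element name "item", the file name response.xml, and the Item class are placeholders for your actual payload:

import java.io.FileInputStream;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ItemByItemReader {

    // Placeholder JAXB class; map it to the real structure of one <item>.
    public static class Item {
        public String name;
    }

    public static void main(String[] args) throws Exception {
        Unmarshaller um = JAXBContext.newInstance(Item.class).createUnmarshaller();
        XMLStreamReader xsr = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream("response.xml"));

        while (xsr.hasNext()) {
            if (xsr.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "item".equals(xsr.getLocalName())) {
                // Unmarshals just this <item>; the reader moves past its end tag.
                Item item = um.unmarshal(xsr, Item.class).getValue();
                process(item);   // handle it, then let it become garbage
            } else {
                xsr.next();
            }
        }
        xsr.close();
    }

    private static void process(Item item) {
        // application-specific handling of one item
    }
}

This way only one item's object graph is in memory at a time, instead of the whole response.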

How to write more than 30 MB of data in xml?

First of all, sorry if I'm repeating this question, but I haven't found any relevant solutions to my problem.
I'm having difficulty finding a way to solve the issues below.
1) I'm facing a scenario where I have to write 30 MB to 400 MB of data to an XML file. When I use a 'String' object to append the data to the XML, I get an 'OutOfMemory' error.
After spending more time on R&D, I learned that using a 'Stream' will resolve this issue, but I'm not sure about this.
2) Once I have constructed the XML, I have to send this data to the DMZ server from Android devices. As far as I know, sending a large amount of data over HTTP is difficult in this situation. In this case,
a) would using FTP be helpful in this scenario?
b) would splitting the data into chunks and sending them be helpful?
Kindly let me know your suggestions. Thanks in advance.
I would consider zipping up the data before FTPing it across. You could use a ZipOutputStream.
For the out-of-memory error, you could consider increasing the heap size.
Check this: Increase heap size in Java
Can you post the heap sizes you tried, your code, and some exception traces?
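A small sketch of the ZipOutputStream idea; the class and method names are illustrative. XML compresses very well, so this alone should cut the transfer size considerably:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipBeforeUpload {

    // Compresses a single file into a .zip next to it and returns the zip file.
    public static File zip(File source) throws IOException {
        File target = new File(source.getParentFile(), source.getName() + ".zip");
        ZipOutputStream zos = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(target)));
        InputStream in = new BufferedInputStream(new FileInputStream(source));
        try {
            zos.putNextEntry(new ZipEntry(source.getName()));
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                zos.write(buffer, 0, read);
            }
            zos.closeEntry();
        } finally {
            in.close();
            zos.close();
        }
        return target;
    }
}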
Use StAX or SAX. These can create XML of any size because they write the XML parts they generate to an OutputStream on the fly.
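For example, a minimal StAX writer that streams records straight to disk; the element names, attribute, and record count are placeholders:

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class StreamedXmlWriter {

    public static void main(String[] args) throws Exception {
        OutputStream out = new BufferedOutputStream(new FileOutputStream("big.xml"));
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        try {
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("records");
            for (int i = 0; i < 1000000; i++) {           // each record goes straight to disk
                w.writeStartElement("record");
                w.writeAttribute("id", Integer.toString(i));
                w.writeCharacters("payload " + i);        // placeholder content
                w.writeEndElement();
            }
            w.writeEndElement();
            w.writeEndDocument();
        } finally {
            w.close();    // closes the writer but not the underlying stream
            out.close();
        }
    }
}

Only the current record is ever held in memory, so the output can be arbitrarily large.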
What you should do is:
First, use an XML parser to read and write data in XML format; it could be SAX or DOM. If the data size is huge, try CSV format; it will take less space since you do not have to store the XML tags.
Second, when creating output files, make sure they are small files.
Third, when sending over the network, make sure you zip everything.
And for goodness' sake, don't eat up the user's mobile data cap with this design. Warn the user about the file size and suggest using a Wi-Fi network.

Java Serialization - Recovering serialized file after process crash

I have the following use case.
A process serializes certain objects to a file using a BufferedOutputStream.
After writing each object, the process invokes flush().
The use case is that if the process crashes while writing an object, I want to recover the file up to the previous object that was written successfully.
How can I deserialize such a file? How will Java behave while deserializing it?
Will it successfully deserialize up to the objects that were written successfully before the crash?
What will the behavior be while reading the last, partially written object? How can I detect that?
Update1 -
I have tried to simulate a process crash by manually killing the process while objects are being written. I have tried around 10-15 times. Each time I am able to deserialize the file, and the file does not have any partial object.
I am not sure if my test is exhaustive enough and therefore need further advice.
Update2 - Adam pointed out a way to simulate such a test by truncating the file randomly.
The following is the behavior observed over around 100 iterations:
From the truncated file (which should be equivalent to the state of the file when a process crashes), Java can read up to the last complete object successfully.
Upon reaching the last, partially written object, Java does not throw a StreamCorruptedException or IOException. It simply throws an EOFException indicating EOF and ignores the partial object.
Each object is either deserialized or not before the next one is read; it won't be affected by a later object that failed to be written or that fails to deserialize.
I suspect you are misusing Java serialization - it's not intended to be a reliable and recoverable means of permanent storage. Use a database for that. If you must, you can use a database to store the serialized form of Java objects, but that would be pretty inefficient.
Yeah, testing such a scenario manually (by killing the process) may be difficult. I would suggest writing a test case where you:
Serialize a set of objects and write them to a file.
Open the file and truncate it at a random position.
Try to load and deserialize it (and see what happens).
Repeat steps 1 to 3 with several other truncation positions.
This way you are sure that you are loading a broken file and that your code handles it properly.
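A self-contained sketch of such a test; the object count, file name, and the choice of String payloads are arbitrary:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

public class TruncationTest {

    public static void main(String[] args) throws Exception {
        File file = new File("objects.ser");

        // 1. Serialize a set of objects, flushing after each one (mirrors the use case above).
        ObjectOutputStream oos = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)));
        for (int i = 0; i < 1000; i++) {
            oos.writeObject("record-" + i);
            oos.flush();
        }
        oos.close();

        // 2. Truncate the file at a random position to simulate a crash mid-write.
        long cut = 1 + (long) (Math.random() * (file.length() - 1));
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        raf.setLength(cut);
        raf.close();

        // 3. Deserialize until the stream ends and count the complete objects recovered.
        List<Object> recovered = new ArrayList<Object>();
        ObjectInputStream ois = null;
        try {
            ois = new ObjectInputStream(new BufferedInputStream(new FileInputStream(file)));
            while (true) {
                recovered.add(ois.readObject());
            }
        } catch (EOFException endOfPartialObject) {
            // clean end of stream: the partial trailing object is simply ignored
        } catch (IOException corruptedTail) {
            // e.g. StreamCorruptedException, depending on where the cut landed
        } finally {
            if (ois != null) {
                ois.close();
            }
        }
        System.out.println("Truncated at " + cut + ", recovered " + recovered.size() + " objects");
    }
}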
Have you tried appending to an ObjectOutputStream? You can find the solution HERE; just find the post that explains how to create an ObjectOutputStream that appends.
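The commonly cited trick (which I believe is what that post describes, though I can't verify the link) is to suppress the stream header when appending, roughly like this:

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

public class AppendingObjectOutputStream extends ObjectOutputStream {

    public AppendingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void writeStreamHeader() throws IOException {
        // The header from the very first write is already in the file;
        // writing it again would corrupt the stream, so just emit a reset marker.
        reset();
    }
}

Use a plain ObjectOutputStream for the first write to the file, and this subclass with new FileOutputStream(file, true) for later appends.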

Moving files after failed validation (Java)

We are validating XML files and depending on the result of the validation we have to move the file into a different folder.
When the XML is valid the validator returns a value and we can move the file without a problem. Same thing happens when the XML is not valid according to the schema.
If however the XML is not well formed, the validator throws an exception, and when we try to move the file, it fails. We believe there is still a handle somewhere in memory that keeps hold of the file. We tried putting System.gc() before moving the file and that sorted the problem, but we can't have System.gc() as a solution.
The code looks like this. We have a File object from which we create a StreamSource. The StreamSource is then passed to the validator. When the XML is not well formed it throws a SAXException. In the exception handling we use the .renameTo() method to move the file.
sc = new StreamSource(xmlFile);
validator.validate(sc);
In the catch we tried
validator.reset();
validator=null;
sc=null;
but still .renameTo() is not able to move the file. If we put System.gc() in the catch, the move will succeed.
Can someone enlighten me on how to sort this out without System.gc()?
We use JAXP and saxon-9.1.0.8 as the parser.
Many thanks
Try creating a FileInputStream, passing that into the StreamSource, and then closing the FileInputStream when you're done. By passing in a File you have lost control of how and when the file handle gets closed.
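A sketch of that approach; the schema setup and the okDir/failedDir destinations are assumptions added to make the example self-contained:

import java.io.File;
import java.io.FileInputStream;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class ValidateAndMove {

    public static void validateThenMove(File xmlFile, File schemaFile,
                                        File okDir, File failedDir) throws Exception {
        Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(schemaFile);
        Validator validator = schema.newValidator();

        boolean valid;
        FileInputStream in = new FileInputStream(xmlFile);   // we own this handle
        try {
            validator.validate(new StreamSource(in));
            valid = true;
        } catch (SAXException notWellFormedOrInvalid) {
            valid = false;
        } finally {
            in.close();   // closed even when validate() throws, so the move can succeed
        }

        File target = new File(valid ? okDir : failedDir, xmlFile.getName());
        if (!xmlFile.renameTo(target)) {
            throw new IllegalStateException("Could not move " + xmlFile);
        }
    }
}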
When you set sc = null, you are indicating to the garbage collector that the StreamSource's file is no longer being used and that it can be collected. Streams close themselves when they are finalized, so if they are garbage collected they will be closed, and the file can therefore be moved on a Windows system (you will not have this problem on a Unix system).
To solve the problem without manually invoking the GC, simply call sc.getInputStream().close() before sc = null. This is good practice anyway.
A common pattern is to put a try ... finally block around any file handle usage, e.g.
try {
    sc = new StreamSource(xmlFile);
    // check stuff
} finally {
    sc.getInputStream().close();
}
// move to the appropriate place
In Java 7, you can instead use the new try with resources block.
Try sc.getInputStream().close() in the catch
All three answers already given are right: you must close the underlying stream, either with a direct call on the StreamSource, by getting the stream and closing it, or by creating the stream yourself and closing it.
However, I've seen this happening under Windows for at least three years: even if you close the stream (really, every stream), trying to move or delete the file will throw an exception ... unless ... you explicitly call System.gc().
However, since System.gc() is not mandatory for a JVM to actually execute a round of garbage collection, and since even if it were the JVM is not required to collect every possible garbage object, you have no real way of being sure that the file can be deleted "now".
I don't have a clear explanation; I can only imagine that the Windows implementation of java.io somehow caches the file handle and does not close it until the handle gets garbage collected.
It has been reported, though I haven't confirmed it, that java.nio is not subject to this behavior, because it has more low-level control over file descriptors.
A solution I've used in the past, though it is quite a hack, was to:
Put the files to delete on a "list".
Have a background thread check that list periodically, call System.gc(), and try to delete those files.
Remove from the list the files you managed to delete, and keep those that could not be deleted yet.
Usually the "lag" is on the order of a few milliseconds, with the occasional file surviving a bit longer.
It could be a good idea to also call deleteOnExit on those files, so that if the JVM terminates before your thread has finished cleaning up, the JVM will try to delete them. However, deleteOnExit had its own bug at the time that prevented exactly this removal, so I didn't. Maybe today it's resolved and you can trust deleteOnExit.
This is the JRE bug that I find most annoying and stupid and cannot believe still exists, but unfortunately I hit it just a month ago on Windows Vista with the latest JRE installed.
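A rough sketch of that deferred-delete workaround; the class name, the queue, and the 500 ms sweep interval are all illustrative choices, not from any real project:

import java.io.File;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PendingDeletes {

    private final ConcurrentLinkedQueue<File> pending = new ConcurrentLinkedQueue<File>();
    private final ScheduledExecutorService cleaner =
            Executors.newSingleThreadScheduledExecutor();

    public PendingDeletes() {
        cleaner.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                sweep();
            }
        }, 500, 500, TimeUnit.MILLISECONDS);
    }

    public void deleteLater(File file) {
        file.deleteOnExit();   // best effort in case the JVM exits before we get to it
        pending.add(file);
    }

    private void sweep() {
        System.gc();   // nudge the collector so lingering handles get finalized
        for (File f : pending) {
            if (!f.exists() || f.delete()) {
                pending.remove(f);   // weakly consistent iterator: removal while iterating is OK
            }
        }
    }
}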
Pretty old, but some people may still find this question.
I was using Oracle Java 1.8.0_77.
The problem occurs on Windows, not on Linux.
A StreamSource instantiated with a File seems to automatically allocate and release the file resource when processed by a validator or transformer (getInputStream() returns null).
On Windows, moving a file into the place of the source file (deleting the source file) after processing is not possible.
Solution/Workaround: Move the file using
Files.move(from.toPath(), to.toPath(), REPLACE_EXISTING, ATOMIC_MOVE);
The use of ATOMIC_MOVE here is the critical point. Whatever the reason is, it has something to do with the annoying way Windows locks files.
