Java VM suddenly exiting without apparent reason

I have a problem with my Java program suddenly exiting, without any exception thrown or the program finishing normally.
I'm writing a program to solve Project Euler's 14th problem. This is what I've got:
// TARGET is not shown in the original snippet; the text below says 1,000,000 was used.
private static final long TARGET = 1000000L;
private static final int INITIAL_CACHE_SIZE = 30000;
private static Map<Long, Integer> cache = new HashMap<Long, Integer>(INITIAL_CACHE_SIZE);

public static void main(String... args) {
    long number = 0;
    int maxSize = 0;
    for (long i = 1; i <= TARGET; i++) {
        int size = size(i);
        if (size > maxSize) {
            maxSize = size;
            number = i;
        }
    }
}

private static int size(long i) {
    if (i == 1L) {
        return 1;
    }
    final int size = size(process(i)) + 1;
    return size;
}

private static long process(long n) {
    return n % 2 == 0 ? n / 2 : 3 * n + 1;
}
This runs fine, and finishes correctly in about 5 seconds when using a TARGET of 1 000 000.
I wanted to optimize by adding a cache, so I changed the size method to this:
private static int size(long i) {
    if (i == 1L) {
        return 1;
    }
    if (cache.containsKey(i)) {
        return cache.get(i);
    }
    final int size = size(process(i)) + 1;
    cache.put(i, size);
    return size;
}
Now when I run it, it simply stops (process exits) when I get to 555144. Same number every time. No exception, error, Java VM crash or anything is thrown.
Changing the cache size doesn't seem to have any effect either, so how could the cache
introduction cause this error?
If I enforce the cache size to be not just the initial size but a permanent bound, like so:
if (i < CACHE_SIZE) {
    cache.put(i, size);
}
the bug no longer occurs.
Edit: When I set the cache size to like 2M, the bug starts showing again.
Can anyone reproduce this, and maybe even provide a suggestion as to why it happens?

This is simply an OutOfMemoryError that is not being printed. The program runs fine if I set a high heap size; otherwise it exits with an unlogged OutOfMemoryError (easy to see in a debugger, though).
You can verify this and get a heap dump (as well as printout that an OutOfMemoryError occurred) by passing this JVM arg and re-running your program:
-XX:+HeapDumpOnOutOfMemoryError
With this it will then print out something to this effect:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid4192.hprof ...
Heap dump file created [91901809 bytes in 4.464 secs]
Bump up your heap size with, say, -Xmx200m and you won't have an issue, at least for TARGET=1000000.
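For example, assuming your main class is named Euler14 (substitute your own class name), the combined run would look like:
java -XX:+HeapDumpOnOutOfMemoryError -Xmx200m Euler14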

It sounds like the JVM itself crashes (that is the first thought when your program dies without a hint of an exception, anyway). The first step in such a problem is to upgrade to the latest revision for your platform. On a crash, the JVM should write a crash log (hs_err_pid*.log) in the directory where you started the JVM, assuming your user level has access rights to that directory.
That being said, some OutOfMemory errors don't report in the main thread, so unless you do a try/catch (Throwable t) and see if you get one, it is hard to be sure you aren't actually just running out of memory. The fact that it only uses 100MB could just mean that the JVM isn't configured to use more. That can be changed with the JVM startup options, e.g. -Xmx1024m for a gigabyte of heap, to see if the problem goes away.
The code for doing the try catch should be something like this:
public static void main(String[] args) {
    try {
        MyObject o = new MyObject();
        o.process();
    } catch (Throwable t) {
        t.printStackTrace();
    }
}
Do everything in the process method, and do not store your cache in static fields. That way, if the error is caught, the object is already out of scope and can be garbage collected, freeing enough memory to print the stack trace. No guarantees that this works, but it gives it a better shot.

One significant difference between the two implementations of size(long i) is the number of objects you are creating.
In the first implementation, no objects are created. In the second you are doing an awful lot of autoboxing: creating a new Long for each access of your cache, and putting in new Longs and new Integers on each modification.
This explains the increase in memory usage, but not the absence of an OutOfMemoryError. Increasing the heap does allow it to complete for me.
From this Sun article:
The performance ... is likely to be poor, as it boxes or unboxes on every get or set operation. It is plenty fast enough for occasional use, but it would be folly to use it in a performance critical inner loop.
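If you want the caching without the boxing, a primitive int[] can hold the sizes for starting numbers below some bound. A minimal sketch (my own variant of the questioner's CACHE_SIZE workaround; values at or above the bound are simply not cached):

private static final int CACHE_SIZE = 1000000;
// 0 means "not yet computed"; index i holds the sequence length starting at i.
private static final int[] cache = new int[CACHE_SIZE];

private static int size(long i) {
    if (i == 1L) {
        return 1;
    }
    if (i < CACHE_SIZE && cache[(int) i] != 0) {
        return cache[(int) i];
    }
    final int size = size(process(i)) + 1;
    if (i < CACHE_SIZE) {
        cache[(int) i] = size;
    }
    return size;
}

The whole cache is allocated up front (about 4 MB for a million ints), and no objects are created per lookup.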

If your Java process suddenly exits, some resource may have been exhausted, memory being the most likely candidate. You could try setting a higher max heap size.

Do you see a heap dump being generated after the crash? The file should be in the JVM's current working directory; that's where I would look for more info.

I am getting an OutOfMemoryError on cache.put(i, size);
To see the error, run your program in Eclipse in debug mode and it will appear in the Debug window. It does not produce a stack trace in the console.

The recursive size() method is probably not a good place to do the caching. I put a call to cache.put(i, size); inside main()'s for-loop instead, and it works much more quickly. Otherwise, I also get an OOM error (no more heap space).
Edit: Here's the source - the cache retrieval is in size(), but the storing is done in main().
public static void main(String[] args) {
    long num = 0;
    int maxSize = 0;
    long start = new Date().getTime();
    for (long i = 1; i <= TARGET; i++) {
        int size = size(i);
        if (size >= maxSize) {
            maxSize = size;
            num = i;
        }
        cache.put(i, size);
    }
    long computeTime = new Date().getTime() - start;
    System.out.println(String.format("maxSize: %4d on initial starting number %6d", maxSize, num));
    System.out.println("compute time in milliseconds: " + computeTime);
}

private static int size(long i) {
    if (i == 1L) {
        return 1;
    }
    if (cache.containsKey(i)) {
        return cache.get(i);
    }
    return size(process(i)) + 1;
}
Note that by removing the cache.put() call from size(), it no longer caches every computed size, but it also avoids re-caching previously computed sizes. This does not change the hashmap operations, but as akf points out, it cuts down the autoboxing/unboxing operations, which is where your heap killer is coming from. I also tried an if (!cache.containsKey(i)) { cache.put(...); } guard in size(), but that unfortunately also runs out of memory.

Related

Java -Xss6G argument not preventing StackOverflowError, no RAM used

The Java program in question is highly recursive (the infamous Ackermann function) and will call itself billions of times. It works in C but not in Java, even when using -Xss6G to increase the stack size (and -Xsx6G, or similar). While that might simply not be enough space, the RAM is not actually being used, as visible in the Ubuntu system monitor, and once there are about 23 thousand invocations of the method on the stack at a time (no matter the allocated memory) Java just gives up. What is the cause of this / why is the overflow still happening?
// invoked as ack(4, 1)
static long recs = 0; // call counter; declared outside the original snippet, added here so it compiles

public static int ack(int m, int n) {
    if (recs % 131072 == 0) System.out.println("Function call number " + recs); // not the code used for the 23-thousand figure, just for tracking how fast the code operates
    recs++;
    int ans;
    if (m == 0) ans = n + 1;
    else if (n == 0) ans = ack(m - 1, 1);
    else ans = ack(m - 1, ack(m, n - 1));
    return ans;
}
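An aside worth checking when experimenting with this (my own sketch, not from the original post): -Xss only sizes the stacks of threads the JVM creates, and the main thread's stack can be capped by the OS. One way to test a really deep recursion is to run it on a worker thread created with the Thread constructor that takes an explicit stackSize argument (the value is documented as a platform-dependent hint):

public class DeepRecursion {
    public static void main(String[] args) throws InterruptedException {
        Runnable task = new Runnable() {
            public void run() {
                // assumes the ack method above is in scope
                System.out.println("ack(4,1) = " + ack(4, 1));
            }
        };
        // Request a 6 GB stack for the worker thread; the JVM may round or ignore the hint.
        Thread worker = new Thread(null, task, "deep-recursion", 6L * 1024 * 1024 * 1024);
        worker.start();
        worker.join();
    }
}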

Java uses more memory than anticipated

Ok, so I'm trying this little experiment in Java. I want to fill up a queue with integers and see how long it takes. Here goes:
import java.io.*;
import java.util.*;

class javaQueueTest {
    public static void main(String args[]) {
        System.out.println("Hello World!");
        long startTime = System.currentTimeMillis();
        int i;
        int N = 50000000;
        ArrayDeque<Integer> Q = new ArrayDeque<Integer>(N);
        for (i = 0; i < N; i = i + 1) {
            Q.add(i);
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println(totalTime);
    }
}
OK, so I run this and get:
Hello World!
12396
About 12 secs, not bad for 50 million integers. But if I try to run it for 70 million integers I get:
Hello World!
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.Integer.valueOf(Integer.java:642)
at javaQueueTest.main(javaQueueTest.java:14)
I also notice that it takes about 10 mins to come up with this message. Hmm, so what if I give it almost all my memory (8 gigs) for the heap? So I run it with a heap size of 7 gigs, but I still get the same error:
javac javaQueueTest.java
java -cp . javaQueueTest -Xmx7g
Hello World!
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.Integer.valueOf(Integer.java:642)
at javaQueueTest.main(javaQueueTest.java:14)
I want to ask two things. First, why does it take so long to come up with the error? Second, why is all this memory not enough? If I run the same experiment for 300 million integers in C (with the glib g_queue) it runs (and in 10 secs no less, although it slows down the computer a lot), so the number of integers cannot be at fault here. For the record, here is the C code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <glib.h>
#include <time.h>

int main() {
    clock_t begin, end;
    double time_spent;
    GQueue *Q;
    begin = clock();
    Q = g_queue_new();
    g_queue_init(Q);
    int N = 300000000;
    int i;
    for (i = 0; i < N; i = i + 1) {
        g_queue_push_tail(Q, GINT_TO_POINTER(i));
    }
    end = clock();
    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("elapsed time: %f \n", time_spent);
}
I compile and get the result:
gcc cQueueTest.c `pkg-config --cflags --libs glib-2.0 gsl ` -o cQueueTest
~/Desktop/Software Development/Tests $ ./cQueueTest
elapsed time: 13.340000
My rough thoughts about your questions:
First, why does it take so long to come up with the error?
As gimpycpu stated in his comment, Java does not start with all of your RAM acquired. If you want that (and you have a 64-bit VM, for larger amounts of RAM), you can add the options -Xmx8g and -Xms8g at VM startup to ensure that the VM gets 8 gigabytes of RAM; -Xms means it will also commit the RAM up front instead of just noting that it may use it. This will reduce the runtime significantly. Also, as already mentioned, Java integer boxing adds quite a bit of overhead.
Why is all this memory not enough?
Java introduces a little memory overhead for every object, and due to boxing the ArrayDeque data structure stores Integer references rather than plain 4-byte ints. So you have to reckon with about 20 bytes for every integer.
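As a rough worked estimate (assuming a 64-bit JVM with compressed references): each Integer object costs about 16 bytes (a 12-byte header plus the 4-byte int, padded to an 8-byte boundary), and the deque's backing array holds a 4-byte reference to it, so roughly 20 bytes per element. At N = 70,000,000 that is about 70M × 20 bytes ≈ 1.4 GB, which easily overflows a default-sized heap; the int[] version below needs only 70M × 4 bytes = 280 MB.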
You can try to use an int[] instead of the ArrayDeque:
import java.io.*;
import java.util.*;

class javaQueueTest {
    public static void main(String[] args) {
        System.out.println("Hello World!");
        long startTime = System.currentTimeMillis();
        int i;
        int N = 50000000;
        int[] a = new int[N];
        for (i = 0; i < N; i = i + 1) {
            a[i] = 0;
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println(totalTime);
    }
}
This will be ultra fast thanks to the use of plain arrays.
On my system it runs in under one second every time!
In your case, the GC struggles because it assumes that at least some objects will be short-lived. In your case all objects are long-lived; this adds a significant overhead to managing the data.
If you use -Xmx7g -Xms7g -verbose:gc and N = 150000000 you get an output like
Hello World!
[GC (Allocation Failure) 1835008K->1615280K(7034368K), 3.8370127 secs]
5327
int is a primitive in Java (4 bytes), while Integer is the wrapper. The wrapper needs a reference to it, an object header, and padding, with the result that an Integer and its reference use about 20 bytes per value.
The solution is to not queue up so many values at once. You can use a Supplier to provide new values on demand, avoiding the need to create the queue in the first place.
Even so, with a 7 GB heap you should be able to create an ArrayDeque of 200M entries or more.
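A minimal sketch of the Supplier idea (Java 8's java.util.function.Supplier; the summing loop is just my stand-in for real work):

import java.util.function.Supplier;

class SupplierDemo {
    public static void main(String[] args) {
        final int[] next = {0}; // mutable counter captured by the lambda
        Supplier<Integer> source = () -> next[0]++;
        long sum = 0;
        for (int i = 0; i < 50000000; i++) {
            sum += source.get(); // one short-lived boxed Integer at a time; nothing is retained
        }
        System.out.println(sum);
    }
}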
First, why does it take so long to come up with the error?
This looks like a classic example of a GC "death spiral". Basically what happens is that the JVM runs full GCs repeatedly, reclaiming less and less space each time. Towards the end, the JVM spends more time running the GC than doing "useful" work. Finally it gives up.
If you are experiencing this, the solution is to configure a GC overhead limit as described here:
GC overhead limit exceeded
(Java 8 configures a GC overhead limit by default. But you are apparently using an older version of Java, judging from the exception message.)
Second, why is all this memory not enough?
See @Peter Lawrey's explanation.
The workaround is to find or implement a queue class that doesn't use generics. Unfortunately, that class will not be compatible with the standard Deque API.
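For illustration, a minimal primitive int ring-buffer queue might look like this (a sketch, not a drop-in Deque replacement; no growth, just a full/empty check):

class IntQueue {
    private final int[] buf;
    private int head, tail, count;

    IntQueue(int capacity) {
        buf = new int[capacity];
    }

    boolean offer(int value) {
        if (count == buf.length) return false; // full
        buf[tail] = value;
        tail = (tail + 1) % buf.length;
        count++;
        return true;
    }

    int poll() {
        if (count == 0) throw new java.util.NoSuchElementException();
        int value = buf[head];
        head = (head + 1) % buf.length;
        count--;
        return value;
    }
}

At 4 bytes per element, 300 million ints fit in about 1.2 GB, in line with the C version.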
You can catch OutOfMemoryError (note that Q and i must be declared before the try block so the catch can see them):
ArrayDeque<Integer> Q = null;
int i = 0;
try {
    Q = new ArrayDeque<Integer>(N);
    for (i = 0; i < N; i = i + 1) {
        Q.add(i);
    }
} catch (OutOfMemoryError e) {
    Q = null;
    System.gc();
    System.err.println("OutOfMemoryError: " + i);
}
in order to show when the OutOfMemoryError is thrown.
And launch your code with:
java -Xmx4G javaQueueTest
in order to increase the heap size for the JVM.
As mentioned earlier, Java is much slower with objects than C is with primitive types ...

Create a task that takes up a specific amount of CPU and RAM

I'm testing my company's software project and would like to see how it behaves under heavy load.
Is there any way to create a task that takes up a large amount of CPU and only stops when I tell it to?
If it's not possible programmatically, what are the other options? E.g. what software and input can quickly help me create such a condition?
Thanks in advance.
This can be done programmatically.
For consuming CPU:
A simple dead loop won't consume all the CPU, because your CPU probably has multiple logical cores, so you need to create multiple threads. Here is the code:
#include <windows.h> // required for the Win32 calls below

DWORD WINAPI ConsumeSingleCore(LPVOID lpThreadParameter)
{
    // Pin this thread to the core whose index was passed in, then spin forever.
    DWORD_PTR mask = (DWORD_PTR)1 << (int)(DWORD_PTR)lpThreadParameter;
    ::SetThreadAffinityMask(::GetCurrentThread(), mask);
    for (;;) {}
}

void ConsumeAllCores()
{
    SYSTEM_INFO systemInfo = { 0 };
    ::GetSystemInfo(&systemInfo);
    for (DWORD i = 0; i < systemInfo.dwNumberOfProcessors; ++i)
    {
        ::CreateThread(NULL, 0, ConsumeSingleCore, (LPVOID)(DWORD_PTR)i, 0, NULL);
    }
}
For consuming memory:
Allocating enough objects on the heap will help, although it is not very accurate, because of overhead from internal structures in the system, like the heap itself. If you need an accurate number, using virtual memory directly is a good choice. Here is the code:
void ConsumeRAM()
{
    SYSTEM_INFO systemInfo = { 0 };
    ::GetSystemInfo(&systemInfo);
    DWORD memSize = 1024 * 1024 * 1024;
    char *buffer = (char *)::VirtualAlloc(NULL, memSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    // Touch all the pages, so the system will allocate physical memory for them.
    for (DWORD memAddrOffset = 0; memAddrOffset < memSize; memAddrOffset += systemInfo.dwPageSize)
    {
        buffer[memAddrOffset] = 0;
    }
    return;
}
And if you just need tools for testing, you could try CPU overload for loading a certain number of cores and MemAlloc for consuming a certain amount of memory.
Any simple loop will consume CPU; typically, processes yield when they are waiting for the OS to do something, such as reading from disk, accessing the network, or allocating memory.
int quiteBig = 20000;
int a[quiteBig];

while (true) { // or check for a terminating condition
    for (long i = 0; i < quiteBig; i++) {
        a[i] = i;
    }
}
In C#:
For RAM:
List<object> hogger = new List<object>();
for (long i = 0; i < someHugeNumber; i++)
    hogger.Add(new object());
Basically, experiment with how many objects you need to take up the space you want.
For CPU:
while (true)
    ;
That is an instant 100% load on one core.
If you want a bit less:
while (true)
{
    for (long i = 0; i < someNumber; i++)
    {
        ;
    }
    Thread.Sleep(1);
}
Fiddle with someNumber to adjust how many busy cycles run before each sleep, averaging the CPU usage down.
As termination for the while-loops you can use a button on a form, a key combination, or something similar.
Yes, it is programmatically possible, as others have shown with examples of how to load the CPU. Note that a CPU-load loop generally only occupies one core, so you will need a copy of the loop (or of the program, pinned to a specific core) running for each core; a Java sketch of the per-core approach appears after the link below.
You could also look into https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing
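For completeness in this page's main language, a hedged Java sketch of the same one-thread-per-core idea (class and field names are my own):

import java.util.concurrent.atomic.AtomicBoolean;

class CpuHog {
    private static final AtomicBoolean running = new AtomicBoolean(true);

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        for (int i = 0; i < cores; i++) {
            Thread t = new Thread(() -> {
                while (running.get()) { } // spin until told to stop
            });
            t.setDaemon(true);
            t.start();
        }
        System.in.read();    // press Enter to stop
        running.set(false);
    }
}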

How to determine in java when I'm near to the OutOfMemoryError?

I need to find out when I'm really close to an OutOfMemoryError so I can flush results to file and call runtime.gc(). My code is something like this:
Runtime runtime = Runtime.getRuntime();
...
if ((1.0 * runtime.totalMemory() / runtime.maxMemory()) > 0.9) {
    ... flush results to file ...
    runtime.gc();
}
Is there a better way to do this? Can someone give me a hand please?
EDIT
I understood that I am playing with fire this way, so I settled on a more solid and simple way of determining when I've had enough. I am currently working with a Jena model, so I do a simple check: if the model has more than 550k statements, I flush, so I don't run any risks.
First: if you want to determine whether you're close to an OutOfMemoryError, all you have to do is compare the current memory with the max memory available to the JVM, and that is what you already did.
Second: you want to flush results to file, and I'm wondering why you only want to do that when you're close to an OutOfMemoryError. You can simply use something like a BufferedWriter around a FileWriter, which flushes automatically whenever its buffer fills up.
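For instance (a minimal sketch; BufferedWriter's default 8 KB buffer is flushed to the file whenever it fills, so results reach disk continuously instead of in one risky end-of-run dump):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

class ResultsWriter {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter("results.txt"))) {
            for (int i = 0; i < 1000000; i++) {
                out.write("result " + i);
                out.newLine();
            } // try-with-resources flushes and closes at the end
        }
    }
}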
Third: don't ever call the GC explicitly; it's bad practice. Optimize your JVM memory arguments instead:
-Xmx -> sets the max memory that the JVM can allocate
-Xms -> the initial memory that the JVM allocates on startup
-XX:MaxPermSize= -> the max Permanent Generation memory
Also:
-XX:MaxNewSize= -> this needs to be about 40% of your Xmx value
-XX:NewSize= -> this needs to be about 40% of your Xmx value
These will speed up the GC.
And -XX:+UseConcMarkSweepGC to enable CMS for the old generation.
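Put together, a launch line using these flags might look like this (the values and the class name MyApp are illustrative only; tune them for your application):
java -Xms512m -Xmx2g -XX:NewSize=800m -XX:MaxNewSize=800m -XX:+UseConcMarkSweepGC MyApp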
This seems to work:

import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

public class LowMemoryDetector {
    // Use a soft reference to some memory - it will be held onto until the GC is nearly out of memory.
    private final SoftReference<byte[]> buffer;
    // The queue that watches for the buffer to be discarded.
    private final ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
    // Have we seen the low condition?
    private boolean seenLow = false;

    public LowMemoryDetector(int bufferSize) {
        // Make my buffer and register the queue for it to be discarded to.
        buffer = new SoftReference<>(new byte[bufferSize], queue);
    }

    /**
     * Please be sure to create a new LMD after it returns true.
     *
     * @return true if a memory low condition has been detected.
     */
    public boolean low() {
        // Preserve the fact that we've seen a low.
        seenLow |= queue.poll() != null;
        return seenLow;
    }
}
private static final int OneMeg = 0x100000;

public void test() {
    LowMemoryDetector lmd = new LowMemoryDetector(2 * OneMeg);
    ArrayList<char[]> eatMemory = new ArrayList<>();
    int ate = 0;
    while (!lmd.low()) {
        eatMemory.add(new char[OneMeg]);
        ate += 1;
    }
    // Let it go.
    eatMemory = null;
    System.out.println("Ate " + ate);
}
it prints
Ate 1070
for me.
Use a buffer size larger than the largest single allocation you make. It needs to be big enough that any allocation request would be satisfied if the buffer were freed.
Please remember that on a 64-bit JVM you may potentially be running with many terabytes of memory. This approach would almost certainly run into difficulties in that case.

What is the difference between Java 6 and 7 that would cause a performance issue?

My general experience with Java 7 tells me that it is faster than Java 6. However, I've run into enough information that makes me believe that this is not always the case.
The first bit of information comes from Minecraft Snooper data found here. My intention was to look at that data to determine the effects of the different switches used to launch Minecraft. For example, I wanted to know whether -Xmx4096m had a negative or positive effect on performance. Before I could get there I looked at the different versions of Java being used. It covers everything from 1.5 to a developer using 1.8. In general, as you increase the Java version you see an increase in fps performance. Throughout the different versions of 1.6 you even see this gradual upward trend. I honestly wasn't expecting to see so many different versions of Java still in the wild, but I guess people don't run the updates like they should.
Somewhere around the later versions of 1.6 you get the highest peaks. 1.7 performs about 10 fps on average below the later versions of 1.6, but still higher than the early versions of 1.6. On a sample from my own system it's almost impossible to see the difference, but when looking at the broader sample it's clear.
To control for the possibility that someone might have found a magic switch for Java, I only looked at the data with no switches being passed. That way I'd have a reasonable baseline before I started looking at the different flags.
I dismissed most of what I was seeing, as it could be some magic Java 6 setup that someone's just not sharing with me.
Now I've been working on another project that requires me to pass an array in an InputStream to be processed by another API. Initially I used a ByteArrayInputStream because it works out of the box. When I looked at its code I noticed that every method was synchronized. Since this was unnecessary for this project, I rewrote one with the synchronization stripped out. I then decided I wanted to know what the general cost of synchronization was for me in this situation.
I mocked up a simple test just to see. I timed everything with System.nanoTime() and used Java 1.6_20 x86, 1.7.0-b147 AMD64, and 1.7_15 AMD64, with and without -server. I expected the AMD64 versions to outperform based on architecture alone and to have any Java 7 advantages. I also looked at the 25th, 50th, and 75th percentiles (blue, red, green). However, 1.6 with no -server beat the pants off of every other configuration.
So my question is:
What is in the 1.6 -server option that impacts performance and is also defaulted to on in 1.7?
I know most of the speed enhancement in 1.7 came from defaulting some of the more radical performance options of 1.6 to on, but one of them is causing a performance difference. I just don't know which ones to look at.
public class ByteInputStream extends InputStream {

    public static void main(String args[]) throws IOException {
        String song = "This is the song that never ends";
        byte[] data = song.getBytes();
        byte[] read = new byte[data.length];
        ByteArrayInputStream bais = new ByteArrayInputStream(data);
        ByteInputStream bis = new ByteInputStream(data);
        long startTime, endTime;
        for (int i = 0; i < 10; i++) {
            /* code for ByteInputStream */
            /*
            startTime = System.nanoTime();
            for (int ctr = 0; ctr < 1000; ctr++) {
                bis.mark(0);
                bis.read(read);
                bis.reset();
            }
            endTime = System.nanoTime();
            System.out.println(endTime - startTime);
            */

            /* code for ByteArrayInputStream */
            startTime = System.nanoTime();
            for (int ctr = 0; ctr < 1000; ctr++) {
                bais.mark(0);
                bais.read(read);
                bais.reset();
            }
            endTime = System.nanoTime();
            System.out.println(endTime - startTime);
        }
    }

    private final byte[] array;
    private int pos;
    private int min;
    private int max;
    private int mark;

    public ByteInputStream(byte[] array) {
        this(array, 0, array.length);
    }

    public ByteInputStream(byte[] array, int offset, int length) {
        min = offset;
        max = offset + length;
        this.array = array;
        pos = offset;
    }

    @Override
    public int available() {
        return max - pos;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int limit) {
        mark = pos;
    }

    @Override
    public void reset() {
        pos = mark;
    }

    @Override
    public long skip(long n) {
        long start = pos;
        pos += n;
        if (pos > max) {
            pos = max;
        }
        return pos - start; // per the InputStream contract, return the number of bytes actually skipped
    }

    @Override
    public int read() throws IOException {
        if (pos >= max) {
            return -1;
        }
        return array[pos++] & 0xFF;
    }

    @Override
    public int read(byte b[], int off, int len) {
        if (pos >= max) {
            return -1;
        }
        if (pos + len > max) {
            len = max - pos;
        }
        if (len <= 0) {
            return 0;
        }
        System.arraycopy(array, pos, b, off, len);
        pos += len;
        return len;
    }

    @Override
    public void close() throws IOException {
    }
} // end class
I think, as the others are saying, that your tests are too short to see the core issues - the graph shows nanoTime values, which implies that the core section being measured completes in 0.0001 to 0.0006 s.
Discussion
The key difference between -server and -client is that -server expects the JVM to be around for a long time, and therefore expends effort early on for better long-term results. -client aims for fast startup times and good-enough performance.
In particular, HotSpot runs with more optimizations, and these take more CPU to execute. In other words, with -server you may be seeing the cost of the optimizer outweighing any gains from the optimization.
See Real differences between "java -server" and "java -client"?
Alternatively, you may be seeing the effects of tiered compilation, where in Java 7 HotSpot doesn't kick in so fast. With only 1000 iterations, full optimization of your code won't happen until later, and the benefits will therefore be smaller.
You might get insight if you run Java with the -Xprof option: the JVM will dump some data about the time spent in various methods, both interpreted and compiled. It should give an idea of what was compiled, and the ratio of (CPU) time before HotSpot kicked in.
However, to get a true picture, you really need to run this much longer - minutes, not milliseconds - to allow Java and the OS to warm up. It would be even better to loop the test in main (so you have a loop containing your instrumented main test loop) so that you can ignore the warm-up.
EDIT: Changed seconds to minutes to ensure that HotSpot, the JVM and the OS are properly 'warmed up'.
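A sketch of that outer-loop idea (runTestOnce is a hypothetical stand-in for the existing instrumented loop):

public class WarmupHarness {
    public static void main(String[] args) {
        for (int round = 0; round < 50; round++) {
            long start = System.nanoTime();
            runTestOnce(); // the existing 1000-iteration measurement
            long elapsed = System.nanoTime() - start;
            if (round >= 10) { // treat the first rounds as warm-up and ignore them
                System.out.println("round " + round + ": " + elapsed + " ns");
            }
        }
    }

    private static void runTestOnce() {
        // hypothetical: put the instrumented mark/read/reset loop here
    }
}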
