maximum limit on Java array - java

I am trying to create a 2D array in Java as follows:
int[][] adjecancy = new int[96295][96295];
but it is failing with the following error:
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2017/04/07 11:58:55 - please wait.
JVMDUMP032I JVM requested System dump using 'C:\eclipse\workspaces\TryJavaProj\core.20170407.115855.7840.0001.dmp' in response to an event
JVMDUMP010I System dump written to C:\eclipse\workspaces\TryJavaProj\core.20170407.115855.7840.0001.dmp
JVMDUMP032I JVM requested Heap dump using 'C:\eclipse\workspaces\TryJavaProj\heapdump.20170407.115855.7840.0002.phd' in response to an event
JVMDUMP010I Heap dump written to C:\eclipse\workspaces\TryJavaProj\heapdump.20170407.115855.7840.0002.phd
One way to solve this is to increase the JVM memory, but I am submitting the code for an online coding challenge; it fails there too, and I will not be able to change the settings there.
Is there any standard limit on, or guidance for, creating large arrays that one should not exceed?

int[][] adjecancy = new int[96295][96295];
When you do that you are trying to allocate 96295 * 96295 * 4 bytes, which is roughly 37,091 MB, i.e. about 37 GB. It is practically impossible to get that much memory from a PC for Java alone.
I don't think you need that much data in hand when your program initializes. You should probably look at ArrayList, which gives you dynamically sized allocation; growing the structure only as needed and freeing entries at runtime is the key.
There is no hard limit or restriction on creating an array, apart from the per-dimension length limit of Integer.MAX_VALUE. As long as you have memory, you can use it. But keep in mind that you should not hold onto a block of memory that makes the JVM's life hectic.
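Since the variable is named adjecancy, this is presumably an adjacency matrix; if the graph is sparse, an adjacency-list representation avoids the quadratic allocation entirely. A minimal sketch, assuming edges are added one at a time (the class and method names here are only illustrative):
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: store only the edges that exist instead of an N x N matrix.
class SparseGraph {
    private final List<List<Integer>> neighbors;

    SparseGraph(int vertexCount) {
        neighbors = new ArrayList<>(vertexCount);
        for (int i = 0; i < vertexCount; i++) {
            neighbors.add(new ArrayList<>());
        }
    }

    void addEdge(int from, int to) {
        neighbors.get(from).add(to);   // memory grows with the number of edges, not with vertices squared
    }

    List<Integer> neighborsOf(int v) {
        return neighbors.get(v);
    }
}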

The array must obviously fit into memory. If it does not, the typical solutions are:
Do you really need int (max value 2,147,483,647)? Maybe byte (max value 127) or short is good enough? A byte is 4 times smaller than an int.
Do you have really many identical values in array (like zeros)? Try to use sparse arrays.
for instance:
Map<Integer, Map<Integer, Integer>> map = new HashMap<>();
map.put(27, new HashMap<Integer, Integer>()); // row 27 exists
map.get(27).put(54, 1); // row 27, column 54 has value 1.
They need more memory per value stored, but have basically no limits on the array space (you can use Long rather than Integer as index to make them really huge).
Maybe you just do not know how long the array should be? Try ArrayList; it resizes itself. Use an ArrayList of ArrayLists for a 2D array.
If nothing else helps, use RandomAccessFile to store your overgrown data in the filesystem; 100 GB or so is not a problem these days on a good workstation, you just need to compute the required offset in the file (see the sketch below). The filesystem is obviously much slower than RAM, but with a good SSD it may be bearable.
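A minimal sketch of the RandomAccessFile approach, assuming a fixed N x N matrix of ints and a hypothetical file name such as matrix.bin:
import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative sketch: treat the file as a flat N x N matrix of 4-byte ints.
class FileBackedMatrix implements AutoCloseable {
    private final RandomAccessFile file;
    private final long n;

    FileBackedMatrix(String path, long n) throws IOException {
        this.file = new RandomAccessFile(path, "rw");   // e.g. "matrix.bin"
        this.n = n;
    }

    void set(long row, long col, int value) throws IOException {
        file.seek((row * n + col) * 4L);   // byte offset of the cell
        file.writeInt(value);
    }

    int get(long row, long col) throws IOException {
        file.seek((row * n + col) * 4L);
        return file.readInt();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}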

By default (heap ergonomics), the maximum heap size the JVM will allocate is about 1/4 of the machine's RAM.
An int in Java takes 4 bytes, and your array allocation needs approximately 37.09 GB of memory.
In that case, even if you dedicated the full heap to just this one array, your machine would need around 148 GB of RAM. That is huge.
Have a look at the reference below.
Ref: http://docs.oracle.com/javase/8/docs/technotes/guides/vm/gc-ergonomics.html
Hope this helps.

It depends on the maximum memory available to your JVM and the content type of the array. For int we need 4 bytes per element. If 1 MB of memory is available on your machine, it can hold a maximum of 1024 * 256 integers (1 MB = 1024 * 1024 bytes; 1,048,576 / 4 = 262,144). Keeping that in mind, you can size your 2D array accordingly.
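As a rough illustration of that arithmetic, here is a sketch that estimates the largest square int array a given heap could hold (using Runtime.maxMemory() as a stand-in for the available heap and ignoring per-row array overhead):
// Rough estimate only: 4 bytes per int, per-object overhead ignored.
long maxHeapBytes = Runtime.getRuntime().maxMemory();
long maxInts = maxHeapBytes / 4;
int side = (int) Math.sqrt((double) maxInts);
System.out.println("Roughly " + side + " x " + side + " ints could fit in the heap");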

The array size you can create depends on the JVM heap size.
96295 * 96295 * 4 (bytes per int) = 37,090,908,100 bytes = ~34.54 GiB. Most JVMs on competitive coding judges don't have that much memory, hence the error.
To get a good idea of the array size you can use for a given heap size, run this code snippet with different -Xmx settings:
Scanner scanner = new Scanner(System.in);   // requires java.util.Scanner
while (true) {
    System.out.println("Enter 2-D array size: ");
    int size = scanner.nextInt();
    int[][] numbers = new int[size][size];  // throws OutOfMemoryError once size*size ints exceed the heap
    numbers = null;                         // drop the reference so the next iteration can reuse the memory
}
e.g. with -Xmx512M -> a 2-D int array with a side of roughly 10,000+ (about 400 MB for 10,000 x 10,000 ints).
Most online judges provide roughly 1.5-2 GB of heap while evaluating submissions.

Related

What is the overhead of creating Java objects from the lines of a CSV file

The code reads the lines of a CSV file like this:
Stream<String> strings = Files.lines(Paths.get(filePath));
then it maps each line in the mapper:
String[] tokens = line.split(",");   // split returns a String[], not a List
return new UserModel(tokens[0], tokens[1], tokens[2], tokens[3]);
and finally collects it:
Set<UserModel> current = currentStream.collect(toSet());
The file size is ~500 MB.
I've connected to the server using jconsole and saw that the heap size grew from 200 MB to 1.8 GB while processing.
I can't understand where this 3x memory usage came from; I expected a spike of around 500 MB or so.
My first impression was that it's because there is no throttling and the garbage collector simply doesn't have enough time for cleanup.
But I've tried using a Guava rate limiter to give the garbage collector time to do its job, and the result is the same.
Tom Hawtin made good points; I just want to expand on them and provide a bit more detail.
Java Strings take at least 40 bytes of memory (that's for empty string) due to java object header (see later) overhead and an internal byte array.
That means the minimal size for non-empty string (1 or more characters) is 48 bytes.
Nowadays, the JVM uses Compact Strings, which means that ASCII-only strings occupy only 1 byte per character; before, it was 2 bytes per char minimum.
That means if your file contains characters beyond ASCII set, then memory usage can grow significantly.
Streams also have more overhead compared to plain iteration with arrays/lists (see here Java 8 stream objects significant memory usage)
I guess your UserModel object adds at least 32 bytes overhead on top of each line, because:
the minimum size of a Java object is 16 bytes, where the first 12 bytes are the JVM "overhead": the object's class reference (4 bytes when compressed oops are used) plus the mark word (used for the identity hash code, biased locking, and by garbage collectors)
and the next 4 bytes are used by the reference to the first "token"
and the next 12 bytes are used by 3 references to the second, third and fourth "token"
and the last 4 bytes are required due to Java Object Alignment at 8-byte boundaries (on 64-bit architectures)
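Put together, a sketch of what such a UserModel presumably looks like, with the estimated sizes from above annotated (this is an assumption about the OP's class, not its actual source; sizes assume a 64-bit JVM with compressed oops):
// Hypothetical reconstruction of UserModel, for illustration only.
public class UserModel {
    // 12-byte object header (8-byte mark word + 4-byte compressed class pointer)
    private final String first;    // 4 bytes (compressed reference)
    private final String second;   // 4 bytes
    private final String third;    // 4 bytes
    private final String fourth;   // 4 bytes
    // 12 + 16 = 28 bytes, padded up to 32 bytes (8-byte alignment)

    public UserModel(String first, String second, String third, String fourth) {
        this.first = first;
        this.second = second;
        this.third = third;
        this.fourth = fourth;
    }
}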
That being said, it's not clear whether you even use all the data that you read from the file - you parse 4 tokens from a line but maybe there are more?
Moreover, you didn't mention how exactly the heap size "grew" - whether it was the committed size or the used size of the heap. The used portion is what is actually being "used" by live objects; the committed portion is what has been allocated by the JVM at some point but could be garbage-collected later; used < committed in most cases.
You'd have to take a heap snapshot to find out how much memory the resulting set of UserModel objects actually occupies; it would be interesting to compare that to the size of the file.
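For example, one common way to take such a snapshot (assuming a HotSpot JVM and that you know the process id) is jmap; the resulting file can then be opened in a tool such as VisualVM or Eclipse MAT:
jmap -dump:live,format=b,file=heap.hprof <pid>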
It may be that the String implementation is using UTF-16 whereas the file may be using UTF-8. That would be double the size, assuming all US-ASCII characters. However, I believe JVMs tend to use a compact form for Strings nowadays.
Another factor is that Java objects tend to be allocated on a nice round address. That means there's extra padding.
Then there's memory for the actual String object, in addition to the actual data in the backing char[] or byte[].
Then there's your UserModel object. Each object has a header, and references are usually 8 bytes (may be 4).
Lastly not all the heap will be allocated. GC runs more efficiently when a fair proportion of the memory isn't, at any particular moment, being used. Even C malloc will end up with much of the memory unused once a process is up and running.
Your code reads the full file into memory. Then you split each line into an array, and then you create an object of your custom class for each line. So you basically have three different pieces of "memory usage" for each line in your file!
While enough memory is available, the JVM might simply not waste time running the garbage collector while turning your 500 megabytes into three different representations. Therefore you are likely to "triplicate" the number of bytes in your file, at least until the GC kicks in and throws away the no-longer-required file lines and split arrays.

File size vs. in memory size in Java

If I take an XML file that is around 2 kB on disk and load its contents into memory in Java as a String, and then measure the object size, it's around 33 kB.
Why the huge increase in size?
If I do the same thing in C++, the resulting string object in memory is much closer to 2 kB.
To measure the memory in Java I'm using Instrumentation.
For C++, I take the length of the serialized object (e.g. string).
I think there are multiple factors involved.
First of all, as Bruce Martin said, objects in Java have an overhead of 16 bytes per object; C++ objects do not.
Second, Strings in Java might be 2 bytes per character instead of 1.
Third, it could be that Java reserves more Memory for its Strings than the C++ std::string does.
Please note that these are just ideas where the big difference might come from.
Assuming that your XML file contains mainly ASCII characters and uses an encoding that represents them as single bytes, you can expect the in-memory size to be at least double, since Java uses UTF-16 internally (I've heard of some JVMs that try to optimize this, though). Added to that will be the overhead for 2 objects (the String instance and an internal char array) with some fields, IIRC about 40 bytes overall.
So your "object size" of 33kb is definitely not correct, unless you're using a weird JVM. There must be some problem with the method you use to measure it.
In Java, a String object has some extra data that increases its size: object data, array data, and some other variables, such as the array reference, offset, and length.
Visit http://www.javamex.com/tutorials/memory/string_memory_usage.shtml for details.
String: a String's memory growth tracks its internal char array's growth. However, the String class adds another 24 bytes of overhead.
For a nonempty String of size 10 characters or less, the added overhead cost relative to useful payload (2 bytes for each char plus 4 bytes for the length), ranges from 100 to 400 percent.
More:
What is the memory consumption of an object in Java?
Yes, you should run the GC and give it time to finish: just call System.gc(); and print the used memory (totalMemory() minus freeMemory()) in a loop. You are also better off creating a million copies of the string in an array (measure the empty array first, then the array filled with strings), to be sure that you measure the size of the strings and not of other service objects that may be present in your program. A String alone cannot take 32 kB, but a hierarchy of XML objects can.
That said, I cannot resist the irony that nobody cares about memory (and cache hits) in the world of Java. We know that the JIT keeps improving and can outperform native C++ code in some cases, so supposedly there is no need to bother with memory optimization; premature optimization is the root of all evil.
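A rough sketch of the measurement idea from the previous paragraph (the XML content and the array size are placeholders; the result is only indicative, since GC timing is not deterministic):
// Estimate per-string memory by filling a large array and diffing the used heap.
Runtime rt = Runtime.getRuntime();
String xml = "<root>...</root>";                      // hypothetical content under test
String[] copies = new String[1_000_000];
System.gc();
long before = rt.totalMemory() - rt.freeMemory();
for (int i = 0; i < copies.length; i++) {
    copies[i] = new String(xml.toCharArray());        // force a distinct copy, not a shared reference
}
System.gc();
long after = rt.totalMemory() - rt.freeMemory();
System.out.println("~" + (after - before) / copies.length + " bytes per string");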
As stated in other answers, Java's String adds an overhead. If you need to store a large number of strings in memory, I suggest you store them as byte[] instead. That way the size in memory should be about the same as the size on disk.
String -> byte[]:
String a = "hello";
byte[] aBytes = a.getBytes(StandardCharsets.UTF_8);   // specify the charset explicitly (java.nio.charset.StandardCharsets)
byte[] -> String:
String b = new String(aBytes, StandardCharsets.UTF_8);

Using too much Ram in Java

I'm writing a program in Java which has to make use of a large hash table; the bigger the hash table can be, the better (it's a chess program :P). Basically, as part of my hash table I have a long[] array, a short[] array, and two byte[] arrays, all of the same size. When I set my table size to ten million, however, it crashes and says "java heap out of memory". This makes no sense to me. Here's how I see it:
1 long + 1 short + 2 bytes = 12 bytes
x 10,000,000 = 120,000,000 bytes
/ 1024 = 117,187.5 KB
/ 1024 = 114.4 MB
Now, 114 MB of RAM doesn't seem like too much to me. My Mac has 4 GB of RAM in total, and an app called FreeMemory shows around 2 GB free while this program is running. Also, I set the Java preferences to -Xmx1024m, so Java should be able to use up to a gig of memory. So why won't it let me allocate just 114 MB?
You predicted that it should use 114 MB, and if I run this (on a Windows box with 4 GB)
public static void main(String... args) {
    long used1 = memoryUsed();
    int Hash_TABLE_SIZE = 10000000;
    long[] pos = new long[Hash_TABLE_SIZE];
    short[] vals = new short[Hash_TABLE_SIZE];
    byte[] depths = new byte[Hash_TABLE_SIZE];
    byte[] flags = new byte[Hash_TABLE_SIZE];
    long used2 = memoryUsed() - used1;
    System.out.printf("%,d MB used%n", used2 / 1024 / 1024);
}

private static long memoryUsed() {
    return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
}
prints
114 MB used
I suspect you are doing something else which is the cause of your problem.
I am using Oracle HotSpot Java 7 update 10
This has not taken into account that each object is accessed through a reference, which also uses memory, and there are more "hidden things"... we must also take alignment into account... a byte is not always just a byte ;-)
Java Objects Memory Structure
How much memory is used by Java
To see how much memory is really in use, you can use a profiler:
visualvm
If you are using a standard HashMap (or something similar from the JDK), each "long" is boxed/unboxed and really takes more than 8 bytes. To use less memory, you can use something like this as a base:
NativeIntHashMap
From what I have read about BlueJ (serious technical information is almost impossible to find), the BlueJ VM quite likely does not support primitive types at all; your arrays are actually arrays of boxed primitives. BlueJ uses a subset of all Java features, with emphasis on object orientation.
If that is the case, plus taking into consideration that performance and efficiency are quite low on BlueJ VM's list of priorities, you may actually be using quite a bit more memory than you think: a whole order of magnitude is quite imaginable.
I believe one way would be to clean up the heap memory after each execution; one link is here:
Java heap space out of memory

Huge String Table in Java

I've got a question about storing a huge amount of Strings in application memory. I need to load from a file and store about 5 million lines, each of them at most 255 chars (URLs), but mostly ~50. From time to time I'll need to search for one of them. Is it possible to keep this app runnable on ~1 GB of RAM?
Will
ArrayList<String> list = new ArrayList<String>();
work?
As far as I know, a String in Java is encoded in UTF-16 internally, which gives me huge memory use. Is it possible to make such an array with Strings encoded in ANSI?
This is a console application run with these parameters:
java -Xmx1024M -Xms1024M -jar "PServer.jar" nogui
The latest JVMs support -XX:+UseCompressedStrings by default, which stores strings that only use ASCII as a byte[] internally.
Having several GB of text in a List isn't a problem, but it can take a while to load from disk (many seconds).
If the average URL is 50 ASCII chars, then with 32 bytes of overhead per String, 5 M entries could use about 400 MB, which isn't much for a modern PC or server.
A Java String is a full-blown object. This means that apart from the characters of the string themselves, there is other information to store in it (a pointer to the object's class, the object header, and some other infrastructure data). So an empty String already takes 45 bytes in memory (as you can see here).
Now you just have to add the maximum length of your strings and make some easy calculations to get the maximum memory of that list.
Anyway, I would suggest you load the strings as byte[] if you have memory issues. That way you can control the encoding and you can still do searches.
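A minimal sketch of that suggestion, assuming the URLs are ASCII and stored one per line in a hypothetical urls.txt:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class UrlStore {
    public static void main(String[] args) throws IOException {
        List<byte[]> urls = new ArrayList<>();
        try (Stream<String> lines = Files.lines(Paths.get("urls.txt"), StandardCharsets.US_ASCII)) {
            lines.forEach(line -> urls.add(line.getBytes(StandardCharsets.US_ASCII)));  // ~1 byte per char
        }
        byte[] needle = "http://example.com/".getBytes(StandardCharsets.US_ASCII);
        boolean found = false;
        for (byte[] url : urls) {
            if (Arrays.equals(url, needle)) {   // linear scan; sort the list and binary-search if this is too slow
                found = true;
                break;
            }
        }
        System.out.println("found = " + found);
    }
}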
Is there some reason you need to restrict it to 1 GB? If you want to search through them, you definitely don't want to swap to disk, but if the machine has more memory it makes sense to go higher than 1 GB.
If you have to search, use a SortedSet, not an ArrayList
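A minimal sketch of that suggestion, assuming exact-match lookups (TreeSet keeps the entries sorted and gives O(log n) contains):
import java.util.SortedSet;
import java.util.TreeSet;

SortedSet<String> urls = new TreeSet<>();
urls.add("http://example.com/a");   // hypothetical entries
urls.add("http://example.com/b");
boolean found = urls.contains("http://example.com/a");   // O(log n) lookup
// For range or prefix-style queries, SortedSet.subSet(from, to) can be used as well.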

Copying a java text file into a String

I run into the following error when I try to store a large file in a string.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
at java.lang.StringBuffer.append(StringBuffer.java:306)
at rdr2str.ReaderToString.main(ReaderToString.java:52)
As is evident, I am running out of heap space. Basically my program looks something like this.
FileReader fr = new FileReader(<filepath>);
StringBuffer sb = new StringBuffer();
char[] b = new char[BLKSIZ];   // BLKSIZ is the read block size
int n;
while ((n = fr.read(b)) > 0)
    sb.append(b, 0, n);
String fileString = sb.toString();
Can someone suggest why I am running into this heap space error? Thanks.
You are running out of memory because the way you've written your program, it requires storing the entire, arbitrarily large file in memory. You have 2 options:
You can increase the memory by passing command line switches to the JVM:
java -Xms<initial heap size> -Xmx<maximum heap size>
You can rewrite your logic so that it deals with the file data as it streams in, thereby keeping your program's memory footprint low.
I recommend the second option. It's more work but it's the right way to go.
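A minimal sketch of that streaming approach (processLine is a placeholder for whatever you actually need to do with each line; the file name is hypothetical):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class StreamingRead {
    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader("big-file.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                processLine(line);   // only one line is held in memory at a time
            }
        }
    }

    private static void processLine(String line) {
        System.out.println(line.length());   // placeholder for the real per-line processing
    }
}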
EDIT: To determine your system's defaults for initial and max heap size, you can use this code snippet (which I stole from a JavaRanch thread):
public class HeapSize {
    public static void main(String[] args) {
        long kb = 1024;
        long heapSize = Runtime.getRuntime().totalMemory();
        long maxHeapSize = Runtime.getRuntime().maxMemory();
        System.out.println("Heap Size (KB): " + heapSize / kb);
        System.out.println("Max Heap Size (KB): " + maxHeapSize / kb);
    }
}
You allocate a small StringBuffer that gets longer and longer. Preallocate according to file size, and you will also be a LOT faster.
Note that Java strings are UTF-16 internally while the file is likely in an 8-bit encoding, so the string uses roughly twice the size in memory.
Depending on the VM (32-bit? 64-bit?) and the limits set (http://www.devx.com/tips/Tip/14688), you may simply not have enough memory available. How large is the file actually?
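A minimal sketch of the preallocation idea, assuming the file path is known up front (the capacity is only an estimate because of the 8-bit file encoding vs. UTF-16 in memory, and it assumes the length fits in an int):
import java.io.File;

File file = new File("big-file.txt");                     // hypothetical path
StringBuffer sb = new StringBuffer((int) file.length());  // roughly one char per byte in the file
// ...append the file contents as before; the buffer no longer needs to grow and re-copy itself.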
In the OP, your program is aborting while the StringBuffer is being expanded. You should preallocate it to the size you need, or at least close to it. When a StringBuffer must expand, it needs RAM for both the original capacity and the new capacity. As TomTom also said, your file likely contains 8-bit characters, so it will be converted to 16-bit Unicode in memory and thus double in size.
Your program has not even reached the next doubling yet: StringBuffer.toString() in Java 6 will allocate a new String and the internal char[] will be copied again (in some earlier versions of Java this was not the case). At the time of that copy you need double the heap space, so at that moment at least 4 times your actual file size (30 MB * 2 for byte -> Unicode, then 60 MB * 2 for the toString() call = 120 MB). Once this method has finished, GC will clean up the temporary objects.
If you cannot increase the heap space for your program you will have some difficulty. You cannot take the "easy" route and just return a String. You can try to do this incrementally so that you do not need to worry about the file size (one of the best solutions).
Look at your web service code in the client. It may provide a way to use a different class other than String - perhaps a java.io.Reader, java.lang.CharSequence, or a special interface, like the SAX related org.xml.sax.InputSource. Each of these can be used to build an implementation class that reads from your file in chunks as the callers needs it instead of loading the whole file at once.
For instance, if your web service handling routes can take a CharSequence then (if they are written well) you can create a special handler to return just one character at a time from the file - but buffer the input. See this similar question: How to deal with big strings and limited memory.
Kris has the answer to your problem.
You could also look at Apache Commons IO's FileUtils.readFileToString, which may be a bit more efficient.
Although this might not solve your problem, some small things you can do to make your code a bit better:
create your StringBuffer with an initial capacity of the size of the String you are reading
close your FileReader at the end: fr.close();
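Both points combined in a small sketch (the file length is used as the initial capacity, and try-with-resources closes the reader automatically; the file name is a placeholder):
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class ReadWithCapacity {
    public static void main(String[] args) throws IOException {
        File file = new File("input.txt");                        // hypothetical path
        StringBuffer sb = new StringBuffer((int) file.length());  // initial capacity roughly the file size
        try (FileReader fr = new FileReader(file)) {              // closed automatically, even on exceptions
            char[] buf = new char[8192];
            int n;
            while ((n = fr.read(buf)) > 0) {
                sb.append(buf, 0, n);
            }
        }
        System.out.println("Read " + sb.length() + " chars");
    }
}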
By default, Java starts with a very small maximum heap (64M on Windows at least). Is it possible you are trying to read a file that is too large?
If so you can increase the heap with the JVM parameter -Xmx256M (to set maximum heap to 256 MB)
I tried running a slightly modified version of your code:
public static void main(String[] args) throws Exception {
    FileReader fr = new FileReader("<filepath>");
    StringBuffer sb = new StringBuffer();
    char[] b = new char[1000];
    int n = 0;
    while ((n = fr.read(b)) > 0)
        sb.append(b, 0, n);
    String fileString = sb.toString();
    System.out.println(fileString);
}
on a small file (2 KB) and it worked as expected. You will need to set the JVM parameter.
Trying to read an arbitrarily large file into main memory in an application is bad design. Period. No amount of JVM settings adjustment is going to fix the core issue here. I recommend that you take a break and do some googling and reading about how to process streams in Java - here's a good tutorial and here's another good tutorial to get you started.
