What would be the memory size/space occupied, in bits/bytes, by an array such as the following?
final String[] objects_user1={"1","10","100","1000","10000"};
ROUGH ESTIMATE: 12 bytes for the array header, 5 x 4 bytes for the pointers (5 x 8 if you're on a 64-bit JVM); each String has 3 ints (+ 3 x 4 bytes) and an array of chars (+ 12 bytes for the header + length of the string x 2, because it's char).
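Plugging the five strings above (lengths 1 through 5, so 15 chars in total) into that estimate on a 32-bit JVM gives roughly:

    array object:    12 (header) + 5 x 4 (String references)   =  32 bytes
    String fields:   5 x 3 x 4   (three ints per String)       =  60 bytes
    char[] objects:  5 x 12 (headers) + 15 x 2 (the chars)     =  90 bytes
                                                        total  ~ 182 bytes

That leaves out the String object headers themselves, the references from each String to its char[], and any alignment padding, so treat it as a lower bound rather than an exact figure.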
Did you try to Google it? Here is the first result of my Google search.
Impossible to say, since it's an implementation detail of the JRE you're using.
You can get an approximate answer by querying available heap space before & after the memory allocation. Run it a number of times & compute the average, & it will be pretty close to the right answer. But again, the answer is only valid for the specific JVM it's run on.
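A minimal sketch of that measurement, assuming you allocate enough copies for the per-array cost to stand out from the noise (the class name is mine):

    public class ArraySizeProbe {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            rt.gc();
            long before = rt.totalMemory() - rt.freeMemory();

            // Allocate many copies so the per-array cost dominates measurement noise.
            String[][] arrays = new String[100_000][];
            for (int i = 0; i < arrays.length; i++) {
                arrays[i] = new String[] {"1", "10", "100", "1000", "10000"};
            }

            rt.gc();
            long after = rt.totalMemory() - rt.freeMemory();
            System.out.println("approx bytes per array: " + (after - before) / arrays.length);
            System.out.println(arrays.length); // keep 'arrays' reachable until after measuring
        }
    }

Note that the string literals are interned and shared across all copies, so this mostly measures the String[] object plus its five references; wrap the literals in new String(...) if you want the String and char[] objects counted as well.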
During the run of my program I create a lot of Strings (1,000,000) up to size 700, and my program eats up a lot of memory. These Strings can contain only R, D, L, U as chars, so I thought that I could represent them differently. I thought about using BitSet, but I am not sure it is more memory efficient. Any ideas?
P.S.: I could also shrink the Strings by compressing runs of equal chars (RRRRRRDDDD -> R6D4), but I was hoping for a better solution.
As a first step, you could try to switch to char[]. A Java String takes approximately 40 bytes more than the sum of its characters (source), and char[] is considerably more convenient than bit arithmetic.
Even more economical is byte[], since one char requires two bytes of storage, while a byte is, of course, one byte (and still has room for 256 distinct values).
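A sketch of both ideas (class and method names are mine): the byte[] conversion is the straightforward saving, and since there are only four symbols you can go further and pack each one into 2 bits.

    class MoveCodec {
        static final String ALPHABET = "RDLU";

        // One byte per symbol: halves the character storage compared to a String's char[].
        static byte[] toBytes(String s) {
            byte[] out = new byte[s.length()];
            for (int i = 0; i < out.length; i++) {
                out[i] = (byte) s.charAt(i);   // 'R', 'D', 'L', 'U' all fit in one byte
            }
            return out;
        }

        // Two bits per symbol: four symbols per byte, at the cost of some bit arithmetic.
        static byte[] pack(String s) {
            byte[] out = new byte[(s.length() + 3) / 4];
            for (int i = 0; i < s.length(); i++) {
                int code = ALPHABET.indexOf(s.charAt(i));   // 0..3, input assumed valid
                out[i / 4] |= code << ((i % 4) * 2);
            }
            return out;
        }

        static char unpack(byte[] packed, int i) {
            int code = (packed[i / 4] >> ((i % 4) * 2)) & 0b11;
            return ALPHABET.charAt(code);
        }
    }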
As the BitSet.get() function uses an int as an argument, I was thinking whether I could store more than 2^32 bits in a BitSet, and if so how would I retrieve them?
I am doing a Project Euler problem where I need to generate primes up to 10^10. The algorithm I'm currently using is the Sieve of Eratosthenes, storing the boolean values as bits in a BitSet. Any workaround for this?
You could use a list of bitsets as List<BitSet> and when the end of one bitset has been reached you could move to the next one.
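A minimal sketch of that idea, splitting the long index into a bucket number and an offset within the bucket (class and constant names are mine):

    import java.util.ArrayList;
    import java.util.BitSet;
    import java.util.List;

    class LongIndexedBitSet {
        private static final int BITS_PER_BUCKET = 1 << 30;   // bits held by each BitSet

        private final List<BitSet> buckets = new ArrayList<>();

        private BitSet bucket(long index) {
            int b = (int) (index / BITS_PER_BUCKET);
            while (buckets.size() <= b) buckets.add(new BitSet());
            return buckets.get(b);
        }

        void set(long index)    { bucket(index).set((int) (index % BITS_PER_BUCKET)); }
        boolean get(long index) { return bucket(index).get((int) (index % BITS_PER_BUCKET)); }
    }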
However, I think your approach is probably incorrect. Even if you use a single bit for each number you need 10^10 bits which is about 1 GB memory (8 bits in a byte and 1024^3 bytes in a GB). Most Project Euler problems should be solvable without needing that much memory.
No, it's limited by the int indexing in its interface.
So they didn't bother exploiting its full potential (the backing storage could address roughly 64x more bits), probably because it wasn't feasible to use that much RAM.
I worked on a LongBitSet implementation, published it here.
It can take:
        2,147,483,647   (Integer.MAX_VALUE, for reference)
      137,438,953,216   LongBitSet max size in bits
                        = 0b1111111_11111111_11111111_11111100_000000L in binary
I had to address some corner cases; in the commit history you can see that the first commit is a copy-paste of java.util.BitSet.
See factory method:
    public static LongBitSet getMaxSizeInstance() {
        // (Integer.MAX_VALUE - 3) << ADDRESS_BITS_PER_WORD
        return new LongBitSet(0b1111111_11111111_11111111_11111100_000000L);
    }
Note: -Xmx24G -Xms24G -ea is roughly the minimum heap the JVM needs to be started with in order to call getMaxSizeInstance() without hitting java.lang.OutOfMemoryError: Java heap space.
I am working on a small task where I am required to store around 1 billion integers in an array. However, I am running into a heap space problem. Could you please help me with this?
Machine details: Core 2 Duo processor with 4 GB RAM. I have even tried -Xmx3072m. Is there any workaround for this?
The same thing works in C++ , so there should definitely be a way to store this many numbers in memory.
Below is the code and the exception I am getting :
public class test {
    private static int C[] = new int[10000 * 10000];

    public static void main(String[] args) {
        System.out.println(java.lang.Runtime.getRuntime().maxMemory());
    }
}
Exception :
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at test.<clinit>(test.java:3)
Use an associative array. The key is an integer, and the value is the count (the number of times the integer has been added to the list).
This should get you some decent space savings if the distribution is relatively random, much more so if it's not.
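A minimal sketch of that idea (class and method names are mine); it pays off when values repeat often, since every distinct key still costs a boxed entry:

    import java.util.HashMap;
    import java.util.Map;

    class IntMultiset {
        private final Map<Integer, Integer> counts = new HashMap<>();

        void add(int value) {
            counts.merge(value, 1, Integer::sum);   // value -> number of times it was added
        }

        int count(int value) {
            return counts.getOrDefault(value, 0);
        }
    }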
If you need to store 1 billion completely random integers then I am afraid that you really do need the corresponding space, i.e. about 4 GB of memory for 32-bit int values. You can try increasing the JVM heap space, but you need a 64-bit OS and at least as much physical memory - and there is only so far that you can go.
On the other hand, you might be able to store those numbers more efficiently if you can make use of specific constraints within your application.
E.g. if you only need to know if a specific int is contained in a set, you could get away with a bit set - i.e. a single bit for each value in the int range. That is about 4 billion bits, i.e. 512 MB - a far more reasonable space requirement. For example, a handful of BitSet objects could cover the whole 32-bit integer range without you having to write any bit-handling code...
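For instance, a membership set over the whole int range could be sketched with just two BitSet objects, one for the non-negative values and one for the negative ones (class and method names are mine):

    import java.util.BitSet;

    class IntRangeSet {
        // Each BitSet covers 2^31 values, i.e. about 256 MB once fully populated.
        private final BitSet nonNegative = new BitSet();
        private final BitSet negative = new BitSet();

        void add(int value) {
            if (value >= 0) nonNegative.set(value);
            else            negative.set(~value);   // maps -1..Integer.MIN_VALUE onto 0..2^31-1
        }

        boolean contains(int value) {
            return value >= 0 ? nonNegative.get(value) : negative.get(~value);
        }
    }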
Maybe using memory-mapped files will help? They are not allocated from the heap.
Here is an article on how to create a matrix this way. An array should be even easier.
Using a memory mapped file for a huge matrix - Peter Lawrey
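A rough sketch of the approach (file name and sizes are mine; note that a single MappedByteBuffer is limited to 2 GB, so a full 4 GB of ints would need several mappings - this shows just one chunk):

    import java.io.RandomAccessFile;
    import java.nio.IntBuffer;
    import java.nio.channels.FileChannel;

    public class MappedInts {
        public static void main(String[] args) throws Exception {
            long ints = 250_000_000L;   // one ~1 GB chunk of the data, backed by the file, not the heap
            try (RandomAccessFile raf = new RandomAccessFile("ints.bin", "rw");
                 FileChannel ch = raf.getChannel()) {
                IntBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, ints * 4L)
                                  .asIntBuffer();
                buf.put(0, 42);
                System.out.println(buf.get(0));
            }
        }
    }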
You can increase the heap to 4 GB on a 32-bit system.
If you're on a 64-bit system you can go higher.
Type this in cmd:
java -Xmx4g programname
As an array this big may not fit into your RAM, you need to configure sufficient HDD swap space. 4 to 16 GB of swap does not look like anything unrealistic these days.
Java only allows int as an array index, not long. Hence the largest possible array can have 2,147,483,647 values - enough here.
Use -Xmx to raise the memory ceiling, which by default will probably be insufficient. 3072m is not enough, as one billion ints requires about 4 GB. Since space is also needed for the operating system and the like, a machine with 4 GB of RAM cannot hold the whole 4 GB data structure in memory.
The JRE or OS may also refuse to grant such a big piece of memory in one go, requiring you to allocate it in smaller chunks (maybe an array of arrays, as sketched below).
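A sketch of the array-of-arrays idea, with a long index split across fixed-size chunks (class name and chunk size are mine):

    class ChunkedIntArray {
        private static final int CHUNK_SIZE = 1 << 27;   // 128M ints, i.e. 512 MB per chunk

        private final int[][] chunks;

        ChunkedIntArray(long length) {
            int n = (int) ((length + CHUNK_SIZE - 1) / CHUNK_SIZE);
            chunks = new int[n][];
            for (int i = 0; i < n; i++) {
                long remaining = length - (long) i * CHUNK_SIZE;
                chunks[i] = new int[(int) Math.min(CHUNK_SIZE, remaining)];
            }
        }

        int get(long index) {
            return chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)];
        }

        void set(long index, int value) {
            chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)] = value;
        }
    }

The JVM may find it easier to allocate eight 512 MB blocks than one contiguous 4 GB block, while the total heap requirement stays the same.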
Is there a way to get an array in Java which is longer than the length supported by an integer data type?
I'm looking for something that may be indexable using a big integer in Java because the natively supported array length is nowhere near as big as I need it to be for an algorithm I am implementing.
This library should be useful for you: http://fastutil.dsi.unimi.it/
It says:
"...provides also big (64-bit) arrays..."
Int32 gives you 8 gigabytes of storage (2^31 ints at 4 bytes each, or roughly 17 GB for longs). Do you have so much memory?
I think you should use sparse arrays, i.e. index elements by hash - for example with just a HashMap<BigInteger,YourValueType>, or with libraries like BigMemory and its alternatives: http://terracotta.org/products/bigmemory
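A minimal sketch of the hash-indexed flavour (the Long value type is an arbitrary choice for illustration):

    import java.math.BigInteger;
    import java.util.HashMap;
    import java.util.Map;

    public class SparseArrayDemo {
        public static void main(String[] args) {
            Map<BigInteger, Long> sparse = new HashMap<>();
            sparse.put(new BigInteger("123456789012345678901234567890"), 42L);
            // Indices that were never set simply fall back to a default value.
            System.out.println(sparse.getOrDefault(BigInteger.valueOf(5), 0L));
            System.out.println(sparse.get(new BigInteger("123456789012345678901234567890")));
        }
    }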
Are you sure that you don't have to change the algorithm? Integer.MAX_VALUE is equal to 2^31 - 1, which is 2,147,483,647; each int takes 4 bytes, which gives us 8,589,934,588 bytes of memory (8 GB!).
If I create 10 integers and an integer array of 10, will there be any difference in total space occupied?
I have to create a boolean array of millions of records, so I want to understand how much space will be taken by the array itself.
An array of integers is represented as a block of memory to hold the integers, plus an object header. The object header typically takes three 32-bit words on a 32-bit JVM, but this is platform dependent. (The header contains some flag bits, a reference to a class descriptor, space for primitive lock information, and the length of the actual array. Plus padding.)
So an array of 10 ints probably takes in the region of 13 * 4 bytes.
In the case of an Integer[], each Integer object has a 2-word header and a 1-word field containing the actual value. You also need to add in padding, and 1 word (or 1 to 2 words on a 64-bit JVM) for the reference. That is typically 5 words or 20 bytes per element of the array ... unless some Integer objects appear in multiple places in the array.
Notes:
The number of words actually used for a reference on a 64 bit JVM depends on whether "compressed oops" are used.
On some JVMs, heap nodes are allocated in multiples of 16 bytes ... which inflates space usage (e.g. the padding mentioned above).
If you take the identity hashcode of an object and it survives the next garbage collection, its size gets inflated by at least 4 bytes to cache the hashcode value.
These numbers are all version and vendor specific, in addition to the sources of variability enumerated above.
Some rough lower bounds calculations:
Each int takes up four bytes. = 40 bytes for ten
An int array takes up four bytes for each component plus four bytes to store the length plus another four bytes to store the reference to it. = 48 bytes (+ maybe some padding to align all objects at 8 byte boundaries)
An Integer takes up at least 8 bytes, plus another four bytes to store the reference to it. = at least 120 for ten
An Integer array takes up at least the 120 bytes for the ten Integers plus four bytes for the length, and then maybe some padding for alignment. Plus four bytes to store the reference to it. (@Marko reports that he even measured about 28 bytes per slot, so that would be 280 bytes for an array of ten.)
In Java you have both Integer and int. Supposing you are referring to int: an array of ints is itself an object, and objects have metadata, so an array of 10 ints will occupy more space than 10 int variables.
What you can do is measure:
public static void main(String[] args) {
    final long startMem = measure();
    final boolean[] bs = new boolean[1000000];
    System.out.println(measure() - startMem);
    bs.hashCode();   // keep bs reachable so it isn't collected before the second measurement
}

private static long measure() {
    final Runtime rt = Runtime.getRuntime();
    rt.gc();
    try { Thread.sleep(20); } catch (InterruptedException e) {}
    rt.gc();
    return rt.totalMemory() - rt.freeMemory();
}
Of course, this goes with the standard disclaimer: gc() has no particular guarantees, so repeat several times to see if you are getting consistent results. On my machine the answer is one byte per boolean.
In light of your comment, it will not make much difference if you use an array. The array itself will use a negligible amount of memory for its own bookkeeping; all other memory will be used by the stored objects.
EDIT: What you need to understand is the difference between the Boolean wrapper and the boolean primitive type. Wrapper types will usually take up more space than the primitives. So for millions of records, try to go with the primitives.
Another thing to keep in mind when dealing with millions of records, as you said, is Java autoboxing. The performance hit can be significant if you unintentionally use it in a function that traverses the whole array.
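A small illustration of that trap (a sketch, not a benchmark): the boxed accumulator below is unboxed and re-boxed on every iteration, while the primitive version touches nothing but the array.

    public class AutoboxingDemo {
        public static void main(String[] args) {
            int[] values = new int[10_000_000];

            long primitiveSum = 0;
            for (int v : values) primitiveSum += v;   // pure primitive arithmetic

            Long boxedSum = 0L;
            for (int v : values) boxedSum += v;       // unbox, add, re-box on every iteration

            System.out.println(primitiveSum + " " + boxedSum);
        }
    }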
It needn't reflect poorly on the teacher / interviewer.
How much you care about the size and alignment of variables in memory depends on how performant you need your code to be. It matters a lot if your software processes transactions (EFT / stock market) for example.
The size, alignment, and packing of your variables in memory can influence CPU cache hits/misses, which can influence the performance of your code by up to a factor of 100.
It's not a bad thing to know what's happening at a low level, as long as you use performance boosting tricks responsibly.
For example, I came to this thread because I needed to know the answer to exactly this question, so that I can size my arrays of primitives to fill an integer multiple of CPU cache lines. The code performing calculations over those arrays needs to execute quickly, because I have a finite window in which the results must be ready for their consumer.
In terms of RAM space, there is no real difference
If you use an array you have 11 objects: the 10 integers plus the array itself, and arrays carry additional metadata. So using an array will take slightly more memory space.
Now for real: this kind of question actually comes up in job interviews and exams, and that shows you what kind of interviewer or teacher you have... with so many layers of abstraction working down there in the VM and in the OS itself, what is the point of thinking about this stuff? Micro-optimizing memory...!
I mean if I create 10 integers and an integer array of 10, will there be any difference in total space occupied.
(integer array of 10) = (10 integers) + 1 integer
The last "+1 integer" is for index of array ( arrays can hold 2,147,483,647 amount of data, which is an integer). That means when you declare an array, say:
int[] nums = new int[10];
you actually reserve space for roughly 11 ints: 10 for the array elements and 1 more for the array itself (its length).