In Java, is a local variable allocated a maximum of 32 bits of memory? If so, what happens if I use a local variable of type long (64 bits) in a method in my Java code? How would memory be allocated to this variable?
Whenever I googled for an answer, I only found explanations of the Java memory areas, which describe where a local variable gets its memory (in the stack frame of the method concerned; that much I already know). That does not answer my question.
The original VM specification is rather awkward with regard to local variables. Each local variable is reserved a "slot" on the stack (simply an index number), and each slot is supposed to hold 4 bytes. So each variable is mapped to one slot, but variables that occupy more than 4 bytes (double, long) need two consecutive slots. References occupy one slot, however, although they may be 8 bytes on a 64-bit VM. There was no 64-bit VM when this was specified, hence the specification assumed 32-bit references.
In practice, I'm pretty sure any current VM will remap the stack slots as it sees fit, and the actual size reserved on the stack will also be decided by the VM. So all that remains is a peculiar slot allocation scheme in the bytecode. The "slot" concept exists purely at the bytecode level; the VM doesn't need to physically adhere to the slot layout the bytecode specifies.
Take a look into the bytecode specification: http://docs.oracle.com/javase/specs/jvms/se5.0/html/Overview.doc.html#17257
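To make the slot rule concrete, here is a minimal sketch that counts bytecode-level slots for a list of local variable types, using the rule described above (two slots for long and double, one for everything else). The names LocalSlots, slotsFor and totalSlots are made up for illustration, and the result reflects the class-file numbering only, not what any particular VM physically reserves on the stack.

```java
class LocalSlots {
    // Slots a local of the given type occupies in the bytecode-level
    // local variable array: two for long and double, one otherwise.
    static int slotsFor(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    // Total slots for a sequence of locals, in declaration order.
    static int totalSlots(Class<?>... types) {
        int total = 0;
        for (Class<?> t : types) {
            total += slotsFor(t);
        }
        return total;
    }

    public static void main(String[] args) {
        long big = 1L << 40; // needs more than 32 bits, yet is an ordinary local
        System.out.println(big); // 1099511627776
        // int (1) + long (2) + reference (1) + double (2)
        System.out.println(totalSlots(int.class, long.class, Object.class, double.class)); // 6
    }
}
```

So a long local simply works; at the bytecode level it is numbered as two consecutive slots.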
JVMs usually word-align local variables on the stack, which means that they take up 32 bits on a 32 bit JVM (except for longs and doubles, which will take up 64 bits) and that they will take up 64 bits on a 64 bit JVM. The JVM is allowed to pack the variables so that they take up less space (e.g. putting 4 bytes in a 32 bit word rather than putting 4 bytes in 4 separate words), but this is slower than having all of the variables be word aligned since the processor will have to unpack them before using them.
Related
I'm trying to figure out how Java structures/allocates memory for objects. (Yes this is implementation specific. I'm using the Oracle 1.7 runtime for this.) I did some work on this here and here and the results are confusing.
First off, in both referenced links, when I allocated an array of objects, the equivalent of new Object[10000], it used 4 bytes per object. On a 32-bit system this makes perfect sense. But I'm on a 64-bit system so what's going on here?
Is Java limited to a 32-bit address space even on 64-bit systems?
Is Java limited to a 32-bit address space per array and each array object then has a pointer to where the elements are?
Something else?
Second, I compared the memory footprint of 8 booleans vs. a byte as the variables in a class. The 8 booleans require 24 bytes/object or 3 bytes/boolean. The single byte approach requires 10 bytes/object.
What is going on with 3 bytes/boolean? I would understand 4 (making each an int) and definitely 1 (making each a byte). But 3?
And what's with expanding a byte to 10 bytes? I would understand 8, where it's expanding a byte to a native int (I'm on a 64-bit system). But what's with the other 2 bytes?
And in the case of different ways to create a RGB class it gets really weird.
For a class composed of 3 byte variables it uses 24 bytes/instance. Expensive but understandable as each uses an int.
So I tried where each class is a single int with the RGB stored in parts of the int (using bit shifting). And it's still 24 bytes/instance. Why on earth is Java taking 3 ints for storage? This makes no sense.
But the weirdest case is where the class has a single variable of "byte[] color = new byte[3];" Note that the byte array is allocated, so it's not just a null pointer. This approach takes less memory than the other two approaches. How???
Any guidance as to what is going on here is appreciated. I have a couple of classes that get allocated a lot, and the flyweight pattern won't work (these objects have their values changed all over the place).
And an associated question: does the order of declaring variables matter? Back in the old days when I did C++ programming, declaring "int, byte, int, byte" used 4 ints' worth of space while "int, int, byte, byte" used 3.
There are far too many questions here but I'll address two of them.
Is Java limited to a 32-bit address space even on 64-bit systems?
32-bit Java is limited to a 32-bit address space. 64-bit Java is not.
Is Java limited to a 32-bit address space per array and each array object then has a pointer to where the elements are?
A Java array is indexed by an int which is 32 bits.
Understood that you are on a 64bit system but are you running a 32bit or 64bit jvm? The JVM itself has an influence on many optimizations that will affect the results of your testing. There are many JVMs (although admittedly only a few are popular) and some may have better, or different, optimizations than others.
A while ago I read that a Java byte, which is 8 bits, is stored internally as an int. I can't seem to find any information online that confirms this.
Thank you for taking the time to answer my question!
What about C++ char? Is it stored as an 8 bits or 32 bits?
How a byte (or any other Java value, for that matter) is stored is not specified by the JLS or the JVMS (the closest you'll find is the abstract specification on the level of the JVM, but that still doesn't say how it's stored natively). It is usually stored in the way which is most appropriate to the hardware architecture at hand, and that is usually 32 bits (or even 64).
Well, if you look at how methods are represented in a class file, you will notice that method parameters are loaded onto a method frame's operand stack with the same bytecode instruction whether they are bytes, ints, booleans, shorts or chars. This implies that they need to take up the same size within a method frame, which is usually 32 bits.
As for storing bytes on the heap, most JVM implementations choose to store bytes with 32 bits, while byte arrays are stored with 8 bits per array entry. This is, however, not specified in the JLS or the JVMS. If you wanted to implement your own JVM, you could use any number of bits to store a byte and still pass the Java TCK compatibility tests.
In other words: what you say is not a guaranteed truth, but it is still correct most of the time.
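One place where this widening is visible at the language level is arithmetic: Java promotes byte, short and char operands to int before applying a binary operator, so adding two bytes yields an int. A small sketch follows; the class name BytePromotion and its methods are hypothetical, and it demonstrates the language rule only, not how many bits a VM physically uses for storage.

```java
class BytePromotion {
    // The expression a + b has type int, so it boxes to an Integer.
    static Object sum(byte a, byte b) {
        return a + b;
    }

    // Narrowing the int result back to a byte needs an explicit cast.
    static byte wrappedSum(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        byte a = 100;
        byte b = 100;
        System.out.println(sum(a, b).getClass().getSimpleName()); // Integer
        System.out.println(sum(a, b));        // 200
        System.out.println(wrappedSum(a, b)); // -56 (200 wrapped into 8 bits)
    }
}
```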
How much memory does Java allocate for fields declared like private char letter; and private int size; at the moment of constructing the object containing these fields?
This depends on the implementation of the virtual machine. The spec specifies that a char primitive type has a value range of 16 Bit, but it does not specify how a virtual machine has to store an object on the heap.
There's no need for such a detailed spec, because VM's don't have to be able to exchange or serialize raw objects from the heap.
To respond to your clarification in a comment: again, it depends on the implementation, but there are a couple of good reasons to allocate the memory for all class attributes at once, at the time the object is "created". If we opted for lazy allocation, we'd have to add mechanisms to dynamically resize objects on the heap at runtime, which is pretty expensive.
If we reserve all space right away at the beginning, then we never have to resize or relocate data on the heap, because the datastructures can never grow or shrink in size.
In the Oracle/Sun JVM, each object is allocated on an 8-byte boundary. So adding a field may not increase the amount of memory used. However, as a guide, here are the sizes of primitives:
type            typical size
byte, boolean   1 byte
char, short     2 bytes
int, float      4 bytes
long, double    8 bytes
Whether the JVM is 32-bit or 64-bit makes no difference to the size of a primitive, but it does change the default size of a reference.
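As a rough worked example of the alignment point: sum the field sizes from the table, add an object header, and round up to the next multiple of 8. The sketch below takes the header size as a parameter (the 12 bytes used in main is purely an illustrative assumption, not a figure from this answer) and ignores any padding between fields, so treat the results as estimates. The names ObjectSizeEstimate, roundUp and estimate are hypothetical.

```java
class ObjectSizeEstimate {
    static final int ALIGNMENT = 8; // objects start on 8-byte boundaries

    // Rounds size up to the next multiple of ALIGNMENT.
    static long roundUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Header plus the sum of the field sizes, rounded up to the alignment.
    static long estimate(int headerBytes, int... fieldBytes) {
        long total = headerBytes;
        for (int f : fieldBytes) {
            total += f;
        }
        return roundUp(total);
    }

    public static void main(String[] args) {
        int header = 12; // assumed header size, for illustration only
        // One int field: 12 + 4 = 16, already aligned.
        System.out.println(estimate(header, Integer.BYTES)); // 16
        // Adding a byte field: 12 + 4 + 1 = 17, rounded up to 24.
        System.out.println(estimate(header, Integer.BYTES, Byte.BYTES)); // 24
        // A second byte field fits in the padding: 18 still rounds to 24.
        System.out.println(estimate(header, Integer.BYTES, Byte.BYTES, Byte.BYTES)); // 24
    }
}
```

The last two lines show why adding a field may not increase the amount of memory used.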
I don't know the specifics of the JVM, but if it helps, the char primitive type uses 16 bits (a Unicode character) to store its data, and int uses 32 bits.
http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
I guess you could test it by creating a very simple Java application and a very simple object.
Run the application without declaring fields and check how much memory it uses (Ctrl+Shift+Escape in Windows), and then re-run and check the difference when you do allocate these fields.
Fields in Java classes that store primitive types are initialised with default values when the object is created, so I would imagine the memory would be allocated then.
This is implementation dependent.
Early JVM implementations were closer to the class file format. In that case byte, short, char, int, float and references take up one slot; long and double two slots. So, effectively round size up to four bytes and that's how much memory it takes up in the object. Then the total for the object, including header, is often rounded up to 8 bytes for better memory alignment. For "compressed oops" (32 bit references on 64 bit platforms, where the bottom bits of the 64-bit address are always zero, allowing the reference to be shifted and more than 4 GB used whilst keeping references down to four bytes), there is strong pressure to align to bigger sizes.
But for the best part of a decade we have had 64-bit JVMs. That means more waste, including waste in terms of processor-memory bandwidth. So in modern implementations the object layout is compacted such that an object uses as much memory as you would expect (plus header and alignment rounding).
Why does a reference type in java take 8 bytes? Why not less or more than 8 bytes?
Actually, it is nowhere specified how many bytes a reference variable shall have, and in fact it is not the same everywhere.
Common virtual machines for 32-bit systems (i.e. systems with 32-bit addresses on the memory bus) usually use 32 bits (= 4 bytes, same as int and float) as the size for object references, while virtual machines for 64-bit systems often use the native address size of 64 bits (= 8 bytes) for these. (Note that most 64-bit systems can run 32-bit programs too, so often even there you'll be using a 32-bit VM.)
It is simply a matter of simplifying the implementation, if you can use an actual memory address for your references instead of something else.
As this increases the amount of memory used (and often we don't actually need to address that much memory), from Java 7 on the 64-bit HotSpot VM can use 32-bit references under certain conditions, i.e. when the heap is smaller than 32 GB (8·2^32 bytes). To convert them to an actual memory address, they are multiplied by 8 (as objects are aligned on 8-byte boundaries) and then added to the base address (if this is not zero). (For heaps smaller than 4 GB, the multiplication step is not needed.)
Other VMs might use similar tricks.
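The decoding step described above amounts to a shift and an add. Here is a minimal sketch of that arithmetic; it is not HotSpot's actual code, and the names (CompressedRefSketch, decode, maxAddressableBytes) and the sample base address are made up for illustration.

```java
class CompressedRefSketch {
    static final int SHIFT = 3; // log2(8): objects are aligned on 8-byte boundaries

    // Turns a 32-bit compressed reference into a 64-bit address:
    // multiply by 8, then add the heap base address.
    static long decode(int compressed, long heapBase) {
        long unsigned = compressed & 0xFFFFFFFFL; // treat the int as unsigned
        return heapBase + (unsigned << SHIFT);
    }

    // Largest heap reachable this way: 2^32 references times 8 bytes each.
    static long maxAddressableBytes() {
        return (1L << 32) << SHIFT;
    }

    public static void main(String[] args) {
        System.out.println(maxAddressableBytes()); // 34359738368, i.e. 32 GB
        System.out.println(decode(2, 0L));         // 16
        // With a non-zero (hypothetical) base address:
        System.out.println(Long.toHexString(decode(2, 0x800000000L))); // 800000010
    }
}
```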
This is because a reference variable doesn't actually hold the object. It's a way to reach the object. How the JVM manages this is not something we need to be concerned with. You can think of it as the address of the object's location on the heap, but it need not be as straightforward as an address.
Let me give an example to help you understand better
String myName = new String("John");
String yourName = new String("John");
// == compares the references, not the contents of the strings
if (myName == yourName)
{
    System.out.println("References point to the same object on the heap");
} else
{
    System.out.println("References point to different objects on the heap");
}
and the output is: References point to different objects on the heap.
Here myName and yourName are two references that point to objects of type String (whose memory is allocated on the heap). Note that memory for these local reference variables is allocated on the stack, not on the heap. Both reference variables merely hold the address (or a similar unique abstraction) of a String object that resides on the heap. The size of an address is constant for a particular architecture: typically 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
So the size of the objects pointed to by a reference variable may change, but the size of the reference itself remains the same, as it simply carries an address, whose size is constant for any particular architecture.
8 bytes is the size of a long variable in Java and allows 2^64 ≈ 1.84 × 10^19 distinct locations to be referenced. That should be enough for most applications, and it is more than most hard drives can address, considering that each of those locations also needs to hold data...
Edit:
Having read one of the other answers I wanted to clarify that the reference is just a pointer to the data and doesn't contain the data, the Java architects decided to use 64 bit values to denote references because it gives you a huge address space.
Values of type reference can be thought of as pointers to objects. The 8 bytes should address the object in the heap.
Read about The Structure of the Java Virtual Machine
How can I tell how much space a pre-sized HashMap takes up before any elements are added? For example, how do I determine how much memory the following takes up?
HashMap<String, Object> map = new HashMap<String, Object>(1000000);
In principle, you can:
calculate it by theory:
look at the implementation of HashMap to figure out what this method does.
look at the implementation of the VM to know how much space the individual created objects take.
measure it somehow.
Most of the other answers are about the second way, so I'll look at the first one (in OpenJDK source, 1.6.0_20).
The constructor uses a capacity that is the next power of two >= your initialCapacity parameter, thus 1048576 = 2^20 in our case.
It then creates a new Entry[capacity] and assigns it to the table variable. (Additionally, it assigns some primitive variables.)
So, we now have one quite small HashMap object (it contains only 3 ints, one float and one reference variable), and one quite big Entry[] object. This array needs space for its array elements (which are normal reference variables) and some metadata (size, class).
So, it comes down to how big a reference variable is. This depends on VM implementation - usually in 32-bit VMs it is 32 bit (= 4 bytes), in 64-bit VMs 64 bit (= 8 bytes).
So, basically on 32-bit VMs your array takes 4 MB, on 64-bit VMs it takes 8 MB, plus some tiny administration data.
If you then fill your HashMap with mappings, each mapping corresponds to an Entry object. This entry object consists of one int and three references, taking about 24 bytes on 32-bit VMs and maybe double that on 64-bit VMs. Thus your 1000000-mapping HashMap (assuming a load factor > 1) would take ~28 MB on 32-bit VMs and ~56 MB on 64-bit VMs.
Additionally to the key and value objects themselves, of course.
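The table-size part of this estimate is simple arithmetic, sketched below. The helper names (HashMapTableEstimate, tableCapacity, tableBytes) are hypothetical, the reference size is passed in as an assumption (4 or 8 bytes), and the array header and the small HashMap object itself are ignored.

```java
class HashMapTableEstimate {
    static final int MAXIMUM_CAPACITY = 1 << 30; // largest power of two that fits in an int

    // Smallest power of two >= initialCapacity, between 1 and MAXIMUM_CAPACITY.
    static int tableCapacity(int initialCapacity) {
        if (initialCapacity <= 1) {
            return 1;
        }
        if (initialCapacity >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        int highest = Integer.highestOneBit(initialCapacity);
        return (highest == initialCapacity) ? highest : highest << 1;
    }

    // Bytes taken by the reference slots of the table array alone.
    static long tableBytes(int initialCapacity, int referenceBytes) {
        return (long) tableCapacity(initialCapacity) * referenceBytes;
    }

    public static void main(String[] args) {
        System.out.println(tableCapacity(1000000)); // 1048576 (2^20)
        System.out.println(tableBytes(1000000, 4)); // 4194304 (4 MB with 4-byte references)
        System.out.println(tableBytes(1000000, 8)); // 8388608 (8 MB with 8-byte references)
    }
}
```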
You could check memory usage before and after creation of the variable. For example:
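// Used heap before creating the map
long preMemUsage = Runtime.getRuntime().totalMemory() -
    Runtime.getRuntime().freeMemory();
HashMap<String, Object> map = new HashMap<String, Object>(1000000);
// Used heap afterwards; the difference approximates the map's footprint
long postMemUsage = Runtime.getRuntime().totalMemory() -
    Runtime.getRuntime().freeMemory();
System.out.println(postMemUsage - preMemUsage);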
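For a self-contained version of this measurement, something like the following could be used. It is only a sketch: Runtime.gc() is merely a hint to the VM, other allocations can skew the result, and the class and method names (HeapDelta, usedHeapBytes) are made up. Treat the printed number as a rough indication rather than an exact size.

```java
import java.util.HashMap;
import java.util.Map;

class HeapDelta {
    // Approximate used heap: total minus free, after suggesting a GC.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // a hint only; the VM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        Map<String, Object> map = new HashMap<String, Object>(1000000);
        long after = usedHeapBytes();
        System.out.println("Approximate bytes used: " + (after - before));
        // Using the map here keeps it reachable until after the second reading.
        System.out.println("Entries: " + map.size());
    }
}
```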
The exact answer will depend on the version of Java you are using, the JVM vendor and the target platform, and is best determined by direct measurement, as described in other answers.
But as a simple estimate, the size is likely to be either ~4 * 2^20 or ~8 * 2^20 bytes, for a 32 bit or 64 bit jvm respectively.
Reasoning:
The Sun Java 1.6 implementation of HashMap has a fixed-size top-level object and a table field that points to the array of references to hash chains.
In a newly created (empty) HashMap the references are all null, and the array size is the next power of two greater than or equal to the supplied initialCapacity. (Yes ... I checked the source code.)
A reference occupies 4 bytes on a typical 32bit JVM and 8 bytes on a typical 64 bit JVM. Some 64 bit JVMs support compact references ("compressed oops"), but you need to set JVM options to enable this.
The top object has 5 fields including the table array reference, but this is a relatively small constant overhead.
The top object and the array have object header overheads, but these are constant and relatively small.
Thus the size of the table array dominates, and it is 2^20 (the next power of 2 greater than 1,000,000) multiplied by the size of a reference.
So, this tells you that setting a large initial capacity really does use a lot of memory. On the other hand, if the initial capacity is a good estimate of the map's capacity when fully populated, you will save significant amounts of time by setting it. (This avoids a number of cycles of reallocating the array and rebuilding of the hash chains.)
You could probably use a profiler like VisualVM and track memory use.
Have a look at this too: http://www.velocityreviews.com/forums/t148009-java-hashmap-size.html
I'd have a look at this article: http://www.javaworld.com/javaworld/javatips/jw-javatip130.html
In short, java does not have a C-style sizeof operator. You could use profiling tools, but IMO the above link gives the simplest solution.
Another piece of info that may be helpful: an empty java String consumes 40 bytes. One million of them would probably be at least 40MB...
I agree that a profiler is really the only way to tell. The other bit of relevant information is whether you're using a 32-bit or 64-bit JVM. The amount of overhead due to memory references (pointers) varies depending on that and whether you have compressed oops turned on. I've found that for smaller data sets the overhead of objects and pointers is significant.
In the latest version of Java 1.7 (I'm looking at 1.7.0_55) HashMap actually lazily instantiates its internal table. It's only instantiated when put() is called - see the private method "inflateTable()". So your HashMap, before you add anything to it at least, will occupy only the handful of bytes of object overhead and instance fields.
You should be able to use VisualVM (comes with JDK 6 or can be downloaded) to create a memory snapshot and inspect the allocated objects for their size.