Memory allocation declaring a field - java

How much memory does Java allocate for fields like private char letter; and private int size; at the moment the object containing them is constructed?

This depends on the implementation of the virtual machine. The spec specifies that the char primitive type has a 16-bit value range, but it does not specify how a virtual machine has to store an object on the heap.
There's no need for such a detailed spec, because VMs don't have to be able to exchange or serialize raw objects from the heap.
To respond to your clarification in a comment: again, it depends on the implementation, but there are a couple of good reasons to allocate the memory for all class attributes once, at the time the object is created. If we opted for lazy allocation, we'd have to add mechanics to dynamically resize objects on the heap at runtime, which is pretty expensive.
If we reserve all the space right away, we never have to resize or relocate data on the heap, because the data structures can never grow or shrink in size.

In the Oracle/Sun JVM, each object is allocated on an 8-byte boundary, so adding a field may not increase the amount of memory used. As a guide, here are the typical sizes of the primitives:
type           typical size
byte, boolean  1 byte
char, short    2 bytes
int, float     4 bytes
long, double   8 bytes
Whether the JVM is 32-bit or 64-bit makes no difference to the size of a primitive, but it does change the default size of a reference.

I don't know the specifics of the JVM,
but if it helps, the char primitive type uses 16 bits (a Unicode character) to store its data, and int uses 32 bits:
http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
I guess you could test it by creating a very simple Java application with a very simple object.
Run the application without declaring the fields and check how much memory it uses (e.g. via Task Manager, Ctrl+Shift+Esc on Windows), then re-run and check the difference when you do declare them.
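Task Manager numbers are very coarse, though. A rough in-process alternative is to allocate many instances and divide the change in used heap by the instance count. This is only a sketch: it assumes a quiet, single-threaded JVM, and System.gc() is merely a hint, so treat the result as an estimate.

public class FieldCostProbe {
    // Hypothetical class under test: add or remove fields and re-run to compare.
    static class Holder {
        private char letter;
        private int size;
    }

    public static void main(String[] args) {
        final int n = 1000000;
        Object[] keep = new Object[n];            // keep instances reachable during measurement
        long before = usedHeap();
        for (int i = 0; i < n; i++) {
            keep[i] = new Holder();
        }
        long after = usedHeap();
        System.out.println("approx. bytes/instance: " + (after - before) / (double) n);
        System.out.println(keep.length);          // use 'keep' so it stays live
    }

    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();                                  // only a hint to the collector
        return rt.totalMemory() - rt.freeMemory();
    }
}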

Fields in Java classes that store primitive types are initialised with default values when the object is created, so I would imagine the memory would be allocated then.

This is implementation dependent.
Early JVM implementations were closer to the class file format. In that case byte, short, char, int, float and references take up one slot each; long and double take two. So, effectively, each field's size is rounded up to four bytes, and that's how much memory it takes up in the object. The total for the object, including the header, is then often rounded up to 8 bytes for better memory alignment. With "compressed oops" (32-bit references on 64-bit platforms, where the bottom bits of the 64-bit address are always zero because of alignment, allowing the reference to be shifted right and more than 4 GB to be addressed while keeping references down to four bytes), there is strong pressure to align to bigger sizes.
But for the best part of a decade we have had 64-bit JVMs. That means more waste, including waste of processor-memory bandwidth. So in modern implementations the object layout is compacted, such that an object uses about as much memory as you would expect (plus header and alignment rounding).

Related

How precisely do Java arrays use memory in HotSpot (i.e. how much slop)?

C malloc implementations typically don't allocate the precise amount of memory requested but instead consume fixed-size runs of memory, e.g. with power-of-two sizes, so that an allocation of 1025 bytes actually takes a 2048-byte segment with 1023 bytes lost as slop.
Does HotSpot use a similar allocation mechanism for Java arrays? If so, what's the right way to allocate a Java array such that there's no slop? (E.g. should the array length be a power of two or maybe power of two minus some fixed amount of overhead?)
If you're asking about the language, the answer is: it's not specified (same as for C).
If you're asking about a specific implementation, check out that implementation. I believe HotSpot uses 8-byte granularity; that is, object sizes are rounded up to the next granularity boundary. If the question is about how much the heap grows when there isn't enough free heap, that depends on the implementation, GC settings, heap size parameters and so on, making it impractical to answer precisely.
EDIT: Using a small reflection hack to access the sun.misc.Unsafe class (Oracle JRE only), object references can be converted to memory addresses; output the addresses of two consecutively allocated arrays to check for yourself.
And I basically asked the same question here: Determine the optimal size for array with respect to the JVM's memory granularity (Answers include an example of using the Unsafe class to check for object size)
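As a sketch of that Unsafe trick (assumptions: an Oracle/OpenJDK JVM, 64-bit, with compressed oops disabled via -XX:-UseCompressedOops so that references are plain 8-byte addresses, and that the two arrays happen to be allocated back to back, which the JVM does not guarantee):

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class ArraySlopProbe {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        byte[] first = new byte[1025];
        byte[] second = new byte[1025];

        // Read the raw reference values out of a holder array.
        Object[] holder = { first, second };
        long base = unsafe.arrayBaseOffset(Object[].class);
        long scale = unsafe.arrayIndexScale(Object[].class);  // 8 with uncompressed oops
        long addrFirst = unsafe.getLong(holder, base);
        long addrSecond = unsafe.getLong(holder, base + scale);

        // If the arrays are adjacent, the delta is the rounded-up size of the first one.
        System.out.println("delta: " + (addrSecond - addrFirst) + " bytes");
    }
}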

How does Java structure memory for an object's members?

I'm trying to figure out how Java structures/allocates memory for objects. (Yes this is implementation specific. I'm using the Oracle 1.7 runtime for this.) I did some work on this here and here and the results are confusing.
First off, in both referenced links, when I allocated an array of objects, the equivalent of new Object[10000], it used 4 bytes per object. On a 32-bit system this makes perfect sense. But I'm on a 64-bit system so what's going on here?
Is Java limited to a 32-bit address space even on 64-bit systems?
Is Java limited to a 32-bit address space per array and each array object then has a pointer to where the elements are?
Something else?
Second, I compared the memory footprint of 8 booleans vs. a byte as the variables in a class. The 8 booleans requires 24 bytes/object or 3 bytes/boolean. The single byte approach requires 10 bytes/object.
What is going on with 3 bytes/boolean? I would understand 4 (making each an int) and definitely 1 (making each a byte). But 3?
And what's with expanding a byte to 10 bytes? I would understand 8, where it's expanded to a native word (I'm on a 64-bit system). But what's with the other 2 bytes?
And in the case of different ways to create a RGB class it gets really weird.
For a class composed of 3 byte variables it uses 24 bytes/instance. Expensive but understandable as each uses an int.
So I tried where each class is a single int with the RGB stored in parts of the int (using bit shifting). And it's still 24 bytes/instance. Why on earth is Java taking 3 ints for storage? This makes no sense.
But the weirdest case is where the class has a single variable of byte[] color = new byte[3]; Note that the byte array is allocated, so it's not just a null pointer. This approach takes less memory than the other two approaches. How???
Any guidance as to what is going on here is appreciated. I have a couple of classes that get allocated a lot and the flyweight pattern won't work (these objects have their values changed all over the place).
And an associated question: does the order of declaring variables matter? Back in the old days when I did C++ programming, declaring "int, byte, int, byte" used 4 ints' worth of space while "int, int, byte, byte" used 3.
There are far too many questions here but I'll address two of them.
Is Java limited to a 32-bit address space even on 64-bit systems?
32-bit Java is limited to a 32-bit address space. 64-bit Java is not.
Is Java limited to a 32-bit address space per array and each array object then has a pointer to where the elements are?
A Java array is indexed by an int which is 32 bits.
Understood that you are on a 64-bit system, but are you running a 32-bit or 64-bit JVM? The JVM itself influences many optimizations that will affect the results of your testing. There are many JVMs (although admittedly only a few are popular) and some may have better, or different, optimizations than others.
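If you would rather see the exact layout than infer it from heap deltas, OpenJDK's JOL tool can print it directly. This is a sketch assuming the jol-core library (which postdates most of this thread) is on the classpath; the printout shows the header size, every field's offset, and any padding, which answers the 8-booleans-vs-one-byte puzzle directly.

import org.openjdk.jol.info.ClassLayout;

public class LayoutProbe {
    static class EightBooleans { boolean a, b, c, d, e, f, g, h; }
    static class OneByte { byte value; }
    static class PackedRgb { int rgb; }    // the bit-shifting RGB variant from the question

    public static void main(String[] args) {
        System.out.println(ClassLayout.parseClass(EightBooleans.class).toPrintable());
        System.out.println(ClassLayout.parseClass(OneByte.class).toPrintable());
        System.out.println(ClassLayout.parseClass(PackedRgb.class).toPrintable());
    }
}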

reference type size in java

Why does a reference type in Java take 8 bytes? Why not less or more than 8 bytes?
Actually, it is nowhere specified how many bytes a reference variable takes, and in fact it is not the same everywhere.
Common virtual machines for 32-bit systems (i.e. systems with 32-bit addresses on the memory bus) usually use 32 bits (= 4 bytes, same as int and float) for object references, while virtual machines for 64-bit systems often use the native address size of 64 bits (= 8 bytes). (Note that most 64-bit systems can run 32-bit programs too, so even there you'll often be using a 32-bit VM.)
It is simply a matter of simplifying the implementation, if you can use an actual memory address for your references instead of something else.
As this increases the memory used (and often we don't actually need to address that much memory), from Java 7 on the 64-bit HotSpot VM can use 32-bit references under certain conditions, namely when the heap is smaller than 32 GB (8·2^32 bytes). To convert such a reference to an actual memory address, it is multiplied by 8 (as objects are aligned on 8-byte boundaries) and then added to the base address (if that is not zero). (For heaps smaller than 4 GB, the multiplication step is unnecessary.)
Other VMs might use similar tricks.
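A sketch of that decoding arithmetic (the shift of 3 corresponds to the 8-byte alignment; heapBase is zero when the heap sits low enough in the address space):

public class CompressedOopDecode {
    // Turns a 32-bit compressed reference into a 64-bit address, as described above.
    static long decode(int narrowOop, long heapBase) {
        return heapBase + ((narrowOop & 0xFFFFFFFFL) << 3); // << 3 == multiply by 8
    }

    public static void main(String[] args) {
        // With 8-byte alignment, 2^32 slots * 8 bytes = 32 GB of addressable heap.
        System.out.println(decode(-1, 0L)); // highest compressed ref -> 34359738360
    }
}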
This is because a reference variable doesn't actually hold the object; it's a way to reach the object. How the JVM manages this is the least of our concerns. You can think of it as an address of the object's location on the heap, but it need not be as straightforward as an address.
Let me give an example to help you understand better:
String myName = new String("John");
String yourName = new String("John");
if (myName == yourName) {
    System.out.println("References point to same objects on heap");
} else {
    System.out.println("References point to different objects on heap");
}
and the output is References point to different objects on heap.
Here myName and yourName are two references which point to objects of type String (memory allocated on the heap). Note that memory for local reference variables is allocated on the stack, not on the heap. Both reference variables merely hold the address (or a similar unique abstraction) of the String object residing on the heap. The size of an address is constant for a particular architecture: for 32 bits it will be 4 bytes, whereas for 64 bits it will be 8 bytes.
So the size of the objects pointed to by the reference variables may change, but the size of the references themselves stays the same, as they simply carry an address, which has a fixed size on any particular architecture.
8 bytes is the size of a long variable in Java and allows for 2^64 ≈ 1.8 × 10^19 distinct references, which should be enough for most applications, and more than most hard drives can hold, considering that each of those referenced locations also needs to hold data...
Edit:
Having read one of the other answers, I want to clarify that a reference is just a pointer to the data and doesn't contain the data itself; the Java architects chose 64-bit values for references because that gives a huge address space.
Values of type reference can be thought of as pointers to objects; 8 bytes is enough to address any object on the heap.
Read about The Structure of the Java Virtual Machine

In Java, empty HashMap space allocation

How can I tell how much space a pre-sized HashMap takes up before any elements are added? For example, how do I determine how much memory the following takes up?
HashMap<String, Object> map = new HashMap<String, Object>(1000000);
In principle, you can:
calculate it by theory:
look at the implementation of HashMap to figure out what this constructor does.
look at the implementation of the VM to know how much space the individual created objects take.
or measure it somehow.
Most of the other answers are about the second way, so I'll look at the first one (in OpenJDK source, 1.6.0_20).
The constructor uses a capacity that is the next power of two >= your initialCapacity parameter, thus 1048576 = 2^20 in our case.
It then creates a new Entry[capacity] and assigns it to the table variable (additionally it assigns some primitive variables).
So, we now have one quite small HashMap object (it contains only 3 ints, one float and one reference variable), and one quite big Entry[] object. This array needs space for its elements (which are normal reference variables) plus some metadata (size, class).
So, it comes down to how big a reference variable is. This depends on VM implementation - usually in 32-bit VMs it is 32 bit (= 4 bytes), in 64-bit VMs 64 bit (= 8 bytes).
So, basically on 32-bit VMs your array takes 4 MB, on 64-bit VMs it takes 8 MB, plus some tiny administration data.
If you then fill your HashMap with mappings, each mapping corresponds to an Entry object. This Entry object consists of one int and three references, taking about 24 bytes on 32-bit VMs, maybe double that on 64-bit VMs. Thus your 1000000-mapping HashMap (assuming a load factor > 1, so the table is never resized) would take ~28 MB on 32-bit VMs and ~56 MB on 64-bit VMs.
Additionally to the key and value objects themselves, of course.
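The capacity rounding mentioned above can be sketched like this; it mirrors the doubling loop in the 1.6-era constructor:

public class CapacityRounding {
    public static void main(String[] args) {
        int initialCapacity = 1000000;
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;               // double until >= initialCapacity
        }
        System.out.println(capacity);     // prints 1048576 (= 2^20)
    }
}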
You could check memory usage before and after creation of the variable. For example:
long preMemUsage = Runtime.getRuntime().totalMemory()
        - Runtime.getRuntime().freeMemory();
HashMap<String, Object> map = new HashMap<String, Object>(1000000);
long postMemUsage = Runtime.getRuntime().totalMemory()
        - Runtime.getRuntime().freeMemory();
The exact answer will depend on the version of Java you are using, the JVM vendor and the target platform, and is best determined by direct measurement, as described in other answers.
But as a simple estimate, the size is likely to be either ~4 · 2^20 or ~8 · 2^20 bytes, for a 32-bit or 64-bit JVM respectively.
Reasoning:
The Sun Java 1.6 implementation of HashMap has a fixed-size top-level object and a table field that points to the array of references to hash chains.
In a newly created (empty) HashMap the references are all null and the array size is the next power of two larger than the supplied initialCapacity. (Yes ... I checked the source code.)
A reference occupies 4 bytes on a typical 32-bit JVM and 8 bytes on a typical 64-bit JVM. Some 64-bit JVMs support compact references ("compressed oops"), but you need to set JVM options to enable this.
The top object has 5 fields including the table array reference, but this is a relatively small constant overhead.
The top object and the array have object header overheads, but these are constant and relatively small.
Thus the size of the table array dominates, and it is 2^20 (the next power of 2 greater than 1,000,000) multiplied by the size of a reference.
So, this tells you that setting a large initial capacity really does use a lot of memory. On the other hand, if the initial capacity is a good estimate of the map's capacity when fully populated, you will save significant amounts of time by setting it. (This avoids a number of cycles of reallocating the array and rebuilding of the hash chains.)
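A rough way to see that time saving (a single-shot sketch, not a proper benchmark; a serious measurement would use JMH with warm-up runs, and the 2 * N capacity is just one hypothetical way to stay under the resize threshold):

import java.util.HashMap;

public class PresizeTiming {
    static final int N = 1000000;

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        fill(new HashMap<Integer, Object>());           // default capacity 16; resizes repeatedly
        long t1 = System.nanoTime();
        fill(new HashMap<Integer, Object>(2 * N));      // presized; no rehashing while filling
        long t2 = System.nanoTime();
        System.out.println("default:  " + (t1 - t0) / 1000000 + " ms");
        System.out.println("presized: " + (t2 - t1) / 1000000 + " ms");
    }

    static void fill(HashMap<Integer, Object> map) {
        Object value = new Object();
        for (int i = 0; i < N; i++) {
            map.put(i, value);              // autoboxing the key; fine for a rough comparison
        }
    }
}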
You could probably use a profiler like VisualVM and track memory use.
Have a look at this too: http://www.velocityreviews.com/forums/t148009-java-hashmap-size.html
I'd have a look at this article: http://www.javaworld.com/javaworld/javatips/jw-javatip130.html
In short, java does not have a C-style sizeof operator. You could use profiling tools, but IMO the above link gives the simplest solution.
Another piece of info that may be helpful: an empty Java String consumes about 40 bytes. One million of them would therefore be at least 40 MB...
I agree that a profiler is really the only way to tell. The other bit of relevant information is whether you're using a 32-bit or 64-bit JVM. The amount of overhead due to memory references (pointers) varies depending on that and whether you have compressed oops turned on. I've found that for smaller data sets the overhead of objects and pointers is significant.
In recent versions of Java 7 (I'm looking at 1.7.0_55), HashMap actually lazily instantiates its internal table: it's only created when put() is first called - see the private method inflateTable(). So your HashMap, before you add anything to it, will occupy only the handful of bytes of object overhead and instance fields.
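You can check that lazy behaviour with a small reflection probe. This is an implementation-specific sketch: it reads HashMap's private "table" field, which works on the JDK 7/8 builds discussed here but fails on later, module-enforced JDKs without extra flags.

import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.HashMap;

public class LazyTableProbe {
    public static void main(String[] args) throws Exception {
        HashMap<String, Object> map = new HashMap<String, Object>(1000000);
        Field tableField = HashMap.class.getDeclaredField("table"); // private implementation detail
        tableField.setAccessible(true);
        Object before = tableField.get(map);
        // On lazy implementations this prints null or length 0: nothing big exists yet.
        System.out.println("before put: "
                + (before == null ? "null" : "length " + Array.getLength(before)));
        map.put("key", new Object());
        Object after = tableField.get(map);
        System.out.println("after put:  length " + Array.getLength(after));
    }
}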
You should be able to use VisualVM (comes with JDK 6 or can be downloaded) to create a memory snapshot and inspect the allocated objects for their size.

Declaring 'long' over 'int' in Java

If int is enough for a field, and if I use long for some reason, would that cost me more memory?
In Java, yes: a long is 8 bytes wide and an int is 4 bytes wide. This Java tutorial covers the primitive data types. If you allocate many of these (say, five million), the difference stops being negligible. For average usage, however, it doesn't matter much.
(You're already using Java, memory's kind of all over the place anyway.)
In native languages there's a performance consideration: a 32-bit value can be held in a single register on a 32-bit architecture, but a 64-bit value cannot; on 64-bit architectures, obviously, it can. I'm not sure what kind of optimization Java does on its native integers, but this might be true in its runtime as well. There are alignment issues to worry about as well -- you see this more with shorts and bytes, though.
Best practice is to use the type you need. If the value will never exceed 2^31 - 1, don't use a long.
Assuming from your previous questions that you mean to ask this in the scope of Java, the int data type is four bytes and the long data type is eight bytes.
However, whether the difference in size actually means a difference in memory usage depends on the situation.
If it's a local variable, it's allocated on the stack. As the stack is already allocated, using more stack space will not use more memory, provided of course that you don't exhaust the stack.
If it a member of a class, it will depend on how the members are aligned. Sometimes members are not stacked compactly in memory, but padding is used so that some members start at an even address. If you for example have a byte and an int in a class, there will likely be three bytes of padding between them so that the int starts at the next address divisible by four.
int is 32-bit and long is 64-bit, so a long takes twice as much memory (which is pretty insignificant for most applications).
long in Java is 64-bit and int is 32-bit, so obviously a long uses more memory (8 bytes instead of 4).
ifwdev guessed correctly. Java defines an int as a 32-bit signed integer, and a long as a 64-bit signed integer. If you declare a variable as a long, then yes, it will take twice as much memory as the same variable declared as an int. In general, int is typically the "default" numeric type, even for values that could be contained in smaller types like a short. Unless you have a particular reason for requiring values greater than 2^31-1, use an int.
...would that cost me more memory?
You'll be using twice as much memory.
Before worrying about if you're using more memory or not, you should profile.
To use one extra megabyte of RAM by using long rather than int, you'd have to declare 262,144 long variables (each long costs 4 extra bytes, and 1,048,576 / 4 = 262,144), or use that many indirectly in your program.
So if for some reason you declare one or two long variables where ints would do, you'll be using 4 or 8 more bytes of memory. Not much to worry about (I mean, there may be worse memory problems in your app).
Taken from the Java Tutorial, here are the definitions of int and long:
int: The int data type is a 32-bit signed two's complement integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive). For integral values, this data type is generally the default choice unless there is a reason (like the above) to choose something else. This data type will most likely be large enough for the numbers your program will use, but if you need a wider range of values, use long instead.
long: The long data type is a 64-bit signed two's complement integer. It has a minimum value of -9,223,372,036,854,775,808 and a maximum value of 9,223,372,036,854,775,807 (inclusive). Use this data type when you need a range of values wider than those provided by int.
But remember: "Premature optimization is the root of all evil", according to Donald Knuth (according to me, Copy/Paste is the root of all evil though).
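To make the range difference concrete, here is the classic pitfall the tutorial's advice guards against (the wrap-around value follows from 31,536,000,000 mod 2^32):

public class RangeDemo {
    public static void main(String[] args) {
        // Milliseconds in a year: 31,536,000,000, which exceeds Integer.MAX_VALUE.
        int wrong = 1000 * 60 * 60 * 24 * 365;    // int arithmetic wraps: prints 1471228928
        long right = 1000L * 60 * 60 * 24 * 365;  // promoted to long first: prints 31536000000
        System.out.println(wrong);
        System.out.println(right);
    }
}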
If you know your data will fit in a specific data type (say, short int in C), the only reason to use a bigger one is performance, right? And if that's your goal, regardless of how marginal the gain is, a general rule of thumb is to use a size that matches your architecture's word size (so for a normal 32-bit target, you'd use a 32-bit type).
If you target more than one system, you can use a data type that matches the most often used one.
