I'm trying to understand the memory footprint of an object in Java. I have read this and other documents on objects and memory in Java.
However, when I use the sizeof Java library or visualvm, I get two different results, and neither of them fits what I would expect from the previous reference (http://www.javamex.com).
For my test, I'm using Java SE 7 Developer Preview on a 64-bit Mac with java.sizeof 0.2.1 and visualvm 1.3.5.
I have three classes, TestObject, TestObject2, TestObject3.
public class TestObject
{
}
public class TestObject2 extends TestObject
{
int a = 3;
}
public class TestObject3 extends TestObject2
{
int b = 4;
int c = 5;
}
My main class:
public class memoryTester
{
public static void main(String[] args) throws Throwable
{
TestObject object1 = new TestObject();
TestObject2 object2 = new TestObject2();
TestObject3 object3 = new TestObject3();
int sum = object2.a + object3.b + object3.c;
System.out.println(sum);
SizeOf.turnOnDebug();
System.out.println(SizeOf.humanReadable(SizeOf.deepSizeOf(object1)));
System.out.println(SizeOf.humanReadable(SizeOf.deepSizeOf(object2)));
System.out.println(SizeOf.humanReadable(SizeOf.deepSizeOf(object3)));
}
}
With java.SizeOf() I get:
{ test.TestObject
} size = 16.0b
16.0b
{ test.TestObject2
a = 3
} size = 16.0b
16.0b
{ test.TestObject3
b = 4
c = 5
} size = 24.0b
24.0b
With visualvm I have:
this (Java frame) TestObject #1 16
this (Java frame) TestObject2 #1 20
this (Java frame) TestObject3 #1 28
According to the documentation I read on the Internet, since I'm on 64 bits, I should have an object header of 16 bytes, which is fine for TestObject.
Then for TestObject2 I should add 4 bytes for the integer field, giving 20 bytes, plus another 4 bytes of padding, giving a total size of 24 bytes for TestObject2. Am I wrong?
Continuing that way for TestObject3, I have to add 8 more bytes for the two integer fields, which should give 32 bytes.
VisualVM seems to ignore padding, whereas java.sizeOf seems to miss 4 bytes, as if they were included in the object header. If I replace an integer by 4 booleans, I get the same result.
Questions:
Why do these two tools give different results?
Should there be padding?
I also read somewhere (I couldn't find the link again) that there can be some padding between a class and its subclass. Is that right? In that case, could an inheritance tree of classes carry some memory overhead?
Finally, is there a Java spec or document that details what Java is doing?
Thanks for your help.
Update:
To answer the comment of utapyngo: to get the size of the objects in visualvm, I create a heap dump, then in the "Classes" part I check the "size" column, next to the "instances" column. The number of instances is 1 for each kind of object.
To answer the comment of Nathaniel Ford: I initialized each field and then computed a simple sum with them in my main method to make use of them. It didn't change the results.
Yes, padding can happen. It is also possible for objects on the stack to be optimised out entirely. Only the JVM knows the exact sizes at any point in time, so techniques that approximate the size from within the Java language all tend to disagree; tools that attach to the JVM tend to be the most accurate. The three main techniques for implementing sizeOf within Java that I am aware of are:
1. Serialize the object and return the length of those bytes (clearly wrong, but useful for relative comparisons).
2. Reflection, with hard-coded size constants for each field found on an object. It can be tuned to be fairly accurate, but changes in the JVM and padding that the JVM may or may not be doing will throw it off.
3. Create loads of objects, run gc and compare changes in JVM heap size.
None of these techniques are accurate.
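To make the second technique concrete, here is a minimal sketch of a reflection-based estimator. The constants are assumptions, not values read from the JVM: a 12-byte header and 4-byte references (as on a 64-bit HotSpot with compressed references) and 8-byte alignment. The class and method names are hypothetical, and the nested classes only mirror the shape of TestObject, TestObject2 and TestObject3 from the question. Real JVMs may lay out fields differently, so treat the result as a rough guide.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class ShallowSizeSketch {
    // Assumed constants; a different JVM or configuration will use other values.
    static final int ASSUMED_HEADER_BYTES = 12;
    static final int ASSUMED_REFERENCE_BYTES = 4;
    static final int ASSUMED_ALIGNMENT_BYTES = 8;

    // Stand-ins with the same field layout as the classes in the question.
    static class Empty {}
    static class OneInt extends Empty { int a = 3; }
    static class ThreeInts extends OneInt { int b = 4; int c = 5; }

    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Hard-coded size of one field of the given declared type.
    static int fieldBytes(Class<?> type) {
        if (!type.isPrimitive()) return ASSUMED_REFERENCE_BYTES;
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == char.class || type == short.class) return 2;
        return 1; // boolean and byte
    }

    // Header plus all instance fields of the class and its superclasses, then padding.
    static long estimateShallowSize(Class<?> type) {
        long size = ASSUMED_HEADER_BYTES;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    size += fieldBytes(f.getType());
                }
            }
        }
        return alignUp(size, ASSUMED_ALIGNMENT_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(estimateShallowSize(Empty.class));
        System.out.println(estimateShallowSize(OneInt.class));
        System.out.println(estimateShallowSize(ThreeInts.class));
    }
}
```

Under these assumptions the sketch yields 16, 16 and 24 bytes, the same figures java.sizeof reported in the question: 12 pads up to 16, 12 + 4 is already 16, and 12 + 12 is already 24.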
If you are running on the Oracle JVM, version 1.5 or later, there is a way to read the size of an object straight out of the C structure used by the Java runtime. It is not a good idea for production, and if you get it wrong you can crash the JVM. But here is a blog post that you may find interesting if you wish to have a go at it: http://highlyscalable.wordpress.com/2012/02/02/direct-memory-access-in-java/
As for documentation on what Java is actually doing, that is JVM specific, version specific and potentially configuration specific too. Each implementation is free to handle objects differently, even to the extent of optimising objects out entirely; for example, objects that are not passed out from the stack need not be allocated on the heap. Some JVMs may even manage to keep the object within CPU registers entirely. That is not your case here, but I include it as an example of why getting the true size of Java objects is tricky.
So it is best to take any sizeOf values you get with a pinch of salt and treat them as a 'guideline' measurement only.
Or should I consider refactoring my virtual indexing method(and its class) into a code-duplicated but faster one?
The issue I'm stuck on: I had some duplicated code, then refactored it and unified it into a single class with just a single virtual method in the child classes, only to minimize future code duplication. Now it is 50% slower than before at accomplishing this:
arr[i]=3.14f; // arr is derived from a base class with `[]` override
(so the derived class implementation is used).
but it became 500% easier to add new types.
How many if-else checks in a non-virtual method make it as fast as a virtual one without if-else checks inside (for today's CPUs with 20-30 stage pipelines)? float + char + double + some other structs: there would be more than 15 different types in my library, so 15x code duplication would make the code 1500% harder to implement and refactor without virtual methods.
Example of my issue:
// implemented IList because C# arrays instead of this,
// can be used in same wrapper property too!
// Reduced even more code duplication.
public class Foo<T>:IList<T>
{
public virtual T this [int i]
{ ... }
}
public unsafe class Bar:Foo<byte>
{
public override byte this[int i]
{
get
{
return *(pByte + i);
}
set
{
*(pByte + i) = value;
}
}
}
Bar b=new Bar(); // Can't use Foo<byte>
// because I denied that with making its constructor `internal`
// because its mis-use would generate undefined behaviour(more than an out-of-bounds access) in a random time in a random place.
b[400]=50;
The reason I have to duplicate code without virtual is that pointers are not allowed for a generic type T.
The reason I have to use pointers is that I have unmanaged, fast GPGPU C++ arrays that should look and behave just like pure C# arrays from the outside.
The reason I had to use unmanaged arrays for GPGPU is that they work at top speed when they are aligned to values such as 4096 that managed arrays cannot guarantee, they need to be pinned, and they reduce C# to C++ transition overhead.
Note: maybe it is not only virtual, but also the IList<T> interface contributing to slowness. Many answers say it comes with a cost but if Java can work around it, why can't C#?
Here is the environment:
.Net 3.5
MSVS 2015 community ed. all optimizations enabled.
windows 10 64 bit
project 64 bit release
c3060 cpu with a single channel ddr3 ram
for benchmarking, heating phase is added, timings are taken after many iterations and used in real data visualizations.
So a number of variations of a question exist here on stackoverflow that ask how to measure the size of an object (for example this one). And the answers point to the fact, without much elaboration, that it is not possible. Can somebody explain in length why is it not possible or why does it not make sense to measure object sizes?
I guess from the tags that you are asking about measuring object sizes in Java and C#. I don't know much about C#, so the following only pertains to Java.
Also, there is a difference between the shallow and the retained size of a single object, and I suppose you are asking about the shallow size (which is the basis for deriving the retained size).
I also interpret your term managed environment to mean that you only want to know the size of an object at runtime in a specific JVM (not, for instance, to calculate the size by looking only at source code).
My short answers first:
Does it make sense to measure object sizes? Yes it does. Any developer of an application which runs under memory constraints is happy to know the memory implications of class layouts and object allocations.
Is it impossible to measure in managed environments? No it is not. The JVM must know about the size of its objects and therefore must be able to report the size of an object. If we only had a way to ask for it.
Long answer:
There are plenty of reasons why the object size cannot be derived from the class definition alone, for example:
The Java language spec defines the value range of each primitive type, not how much memory it occupies. An int holds 32-bit values, but the real storage size is up to the VM.
I am not sure what the language spec says about the size of references. Is there any constraint on the number of possible objects in a JVM (which would have implications for the size of internal storage for object references)? Today's JVMs typically use 4 bytes for a reference on 32-bit platforms (and on 64-bit HotSpot with compressed references), and 8 bytes otherwise.
JVMs may (and do) pad objects to align them at some boundary, which may extend the object size. Today's JVMs usually align objects on an 8-byte boundary.
But all these reasons do not apply to a JVM runtime which uses actual memory layouts, eventually allows its generational garbage collector to push objects around, and must therefore be able to report object sizes.
So how do we know about object sizes at runtime?
In Java 1.5 we got java.lang.instrument.Instrumentation#getObjectSize(Object).
The Javadoc says:
Returns an implementation-specific approximation of the amount of
storage consumed by the specified object. The result may include some
or all of the object's overhead, and thus is useful for comparison
within an implementation but not between implementations. The estimate
may change during a single invocation of the JVM.
Reading with a grain of salt this tells me that there is a reasonable way to get the exact shallow size of an object during one point at runtime.
Getting the size of an object is easily possible.
It may add some overhead if the object is large, because we use I/O streams to obtain the size.
If you have to get the size of larger objects very frequently, you have to be careful.
Have a look at the code below.
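import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectData implements Serializable {
    private int id = 1;
    private String name = "sunrise76";
    private String city = "Newyork";
    private int[] dimensions = {20, 45, 789};
}

public class ObjectSize {
    public static void main(String[] args) {
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        // try-with-resources closes the object stream when the block ends
        try (ObjectOutputStream oos = new ObjectOutputStream(b)) {
            oos.writeObject(new ObjectData());
            // Flush so every serialized byte has reached the byte array before measuring
            oos.flush();
            // This is the length of the serialized form of the object
            System.out.println("Size:" + b.size());
        } catch (IOException err) {
            err.printStackTrace();
        }
    }
}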
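As noted in an earlier answer, the serialized length is not the heap footprint and is mainly useful for relative comparisons. The following sketch illustrates this; the class and method names are hypothetical. The serialized stream also carries a stream header and a class descriptor, so even a boxed Integer with a 4-byte payload serializes to considerably more than 4 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializedLengthSketch {
    // Returns the number of bytes Java serialization produces for the value.
    static int serializedLength(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Relative comparison: the larger array serializes to more bytes.
        System.out.println("int[10]:   " + serializedLength(new int[10]));
        System.out.println("int[1000]: " + serializedLength(new int[1000]));
        // Includes the stream header and class descriptor, not just the 4-byte value.
        System.out.println("Integer:   " + serializedLength(Integer.valueOf(42)));
    }
}
```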
I am studying java, and I remember reading somewhere that java objects, had some overhead inside the JVM, which was used for administration reasons by the virtual machine. So my question is, can someone tell me if and how I can get an object's total size in the HotSpot JVM, along with any overhead it may come with?
You can't get the overhead directly. The amount of overhead is implementation dependent, and can vary based on a number of factors (e.g. the precise JVM version, and whether you are on a 32 or 64bit JVM).
However it is reasonably safe to assume that in typical modern JVM implementations like HotSpot, the overhead per object is between 8 and 16 bytes. Arrays typically have an overhead that is 4 bytes larger than other objects (to contain the integer array length).
See also:
In Java, what is the best way to determine the size of an object?
Memory usage of Java objects: general guide
I found this article rather informative, although I had some doubts about some of the values mentioned in the tables.
Here is a snippet for the object header, object overhead, array header and object reference. I hope it helps someone, if not the OP, as this is quite an old question.
// Assumed sizes in bytes; the values are chosen per data model in the static block below
private static int OBJ_HEADER;
private static int ARR_HEADER;
private static int INT_FIELDS = 12;
private static int OBJ_REF;
private static int OBJ_OVERHEAD;
private static boolean IS_64_BIT_JVM;
static {
    String arch = System.getProperty("sun.arch.data.model");
    // Treat the JVM as 64-bit unless the property explicitly reports a 32-bit data model
    IS_64_BIT_JVM = (arch == null) || !arch.contains("32");
    OBJ_HEADER = IS_64_BIT_JVM ? 16 : 8;
    ARR_HEADER = IS_64_BIT_JVM ? 24 : 12;
    OBJ_REF = IS_64_BIT_JVM ? 8 : 4;
    OBJ_OVERHEAD = OBJ_HEADER + INT_FIELDS + OBJ_REF + ARR_HEADER;
}
I should say that I know only the solution, but I haven't yet figured out why this works. This is why people should leave comments in their code... Oh, well, when I do figure it out, I will share the logic behind it.
What is the size of an empty class in C++ and Java?
Why is it not zero?
sizeof(); returns 1 in the case of C++.
Short Answer for C++:
The C++ standard explicitly says that a class can not have zero size.
Long Answer for C++:
Because each object needs to have a unique address (also defined in the standard) you can't really have zero sized objects.
Imagine an array of zero sized objects. Because they have zero size they would all line up on the same address location. So it is easier to say that objects can not have zero size.
Note:
Even though an object has a non zero size, if it actually takes up zero room it does not need to increase the size of derived class:
Example:
#include <iostream>
class A {};
class B {};
class C: public A, B {};
int main()
{
std::cout << sizeof(A) << "\n";
std::cout << sizeof(B) << "\n";
std::cout << sizeof(C) << "\n"; // Result is not 3 as intuitively expected.
}
g++ ty.cpp
./a.out
1
1
1
In the Java case:
There is no simple way to find out how much memory an object occupies in Java; i.e. there is no sizeof operator.
There are a few ways (e.g. using Instrumentation or 3rd party libraries) that will give you a number, but the meaning is nuanced [1]; see In Java, what is the best way to determine the size of an object?
The size of an object (empty or non-empty) is platform specific.
The size of an instance of an "empty class" (i.e. java.lang.Object) is not zero because the instance has implicit state associated with it. For instance, state is needed:
so that the object can function as a primitive lock,
to represent its identity hashcode,
to indicate if the object has been finalized,
to refer to the object's runtime class,
to hold the object's GC mark bits,
and so on.
Current Hotspot JVMs use clever tricks to represent the state in an object header that occupies two 32 bit words. (This expands in some circumstances; e.g. when a primitive lock is actually used, or after identityHashCode() is called.)
[1] For example, does the size of the string object created by new String("hello") include the size of the backing array that holds the characters? From the JVM perspective, that array is a separate object!
Because every C++ object needs to have a separate address, it isn't possible to have a class with zero size (other than some special cases related to base classes). There is more information in C++: What is the size of an object of an empty class? .
Because an object has to have an address in memory, and to have an address in memory, it has to occupy "some" memory. So, it is usually, in C++, the smallest possible amount, i.e. 1 char (but that might depend on the compiler). In Java, I wouldn't be so sure.. it might have some default data (more than just a placeholder like in C++), but it would be surprising if it was much more than in C++.
C++ requires that a normal instantiation of it have a size of at least 1 (could be larger, though I don't know of a compiler that does that). It allows, however, an "empty base class optimization", so even though the class has a minimum size of 1, when it's used as a base class it does not have to add anything to the size of the derived class.
I'd guess Java probably does pretty much the same. The reason C++ requires a size of at least 1 is that it requires each object to be unique. Consider, for example, an array of objects with size zero. All the objects would be at the same address, so you'd really only have one object. Allowing it to be zero sounds like a recipe for problems...
It's defined by the C++ standard as "a nonzero value", because an allocated object must have a nonzero size in order to have a distinct address. A class that inherits from an empty class, however, is not required to increase in size, barring the usual increase of a vtable if there are virtual functions involved.
I don't know if there is a sizeof() operator in java. What you can do is create an instance of the empty class (have it serializable), send it through a PipedOutputStream and read it as byte array - byteArray.length gives you the size.
Alternatively, write out the instance to a file using DataOutputStream, close the File, open it and file.length() will give you the size of the Object. Hope this helps, - M.S.
As others have pointed out, C++ objects cannot have zero size. A class can contribute zero size only when it acts as an empty base class of another class. Take a look at @Martin York's answer for a description with examples, and also look at and vote for the other answers that are correct in this respect.
In Java, in the hotspot VM, there is a memory overhead of 2 machine-words (usually 4 bytes in a 32 arch per word) per object to hold book keeping information together with runtime type information. For arrays a third word is required to hold the size. Other implementations can take a different amount of memory (the classic Java VM, according to the same reference took 3 words per object)
What is the proper way to measure how much memory from the heap should be used to create new object of a certain type (let's talk about Integers to keep it simple)?
Can this value be calculated without experiment? What are the rules in that case? Are these rules strictly specified somewhere or they can vary from jvm to jvm?
It could vary from JVM to JVM.
You may like this blog post from an Oracle engineer:
In the case of a Java Integer on a 32-bit Hotspot JVM, the 32-bit payload (a Integer.value field) is accompanied by a 96 additional bits, a mark, a klass, and a word of alignment padding, for a total of 128 bits. Moreover, if there are (say) six references to this integer in the world (threads plus heap), those references also occupy 192 bits, for a total of 320 bits. On a 64-bit machine, everything is twice as big, at least at present: 256 bits in the object (which now includes 96 bits of padding), and 384 bits elsewhere. By contrast, six copies of an unboxed primitive integer occupy 192 bits
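The arithmetic in the quoted passage can be checked with a short sketch. It only reproduces the quote's own accounting (four machine words per boxed Integer and one word per reference); it is not a measurement of any JVM, and the class and method names are hypothetical.

```java
class IntegerFootprintSketch {
    static final int PAYLOAD_BITS = 32;

    // Per the quote: mark word, klass word, the payload slot and padding fill four machine words.
    static int objectBits(int wordBits) {
        return 4 * wordBits;
    }

    // Whatever is left after the mark word, the klass word and the 32-bit payload is padding.
    static int paddingBits(int wordBits) {
        return objectBits(wordBits) - 2 * wordBits - PAYLOAD_BITS;
    }

    // Each reference to the object occupies one machine word.
    static int referenceBits(int wordBits, int references) {
        return references * wordBits;
    }

    public static void main(String[] args) {
        for (int wordBits : new int[] {32, 64}) {
            int total = objectBits(wordBits) + referenceBits(wordBits, 6);
            System.out.println(wordBits + "-bit: object " + objectBits(wordBits)
                    + " bits (padding " + paddingBits(wordBits) + "), six references "
                    + referenceBits(wordBits, 6) + " bits, total " + total + " bits");
        }
        System.out.println("six unboxed ints: " + 6 * PAYLOAD_BITS + " bits");
    }
}
```

For a 32-bit word this gives 128 bits per object and 320 bits in total; for a 64-bit word it gives 256 bits per object with 96 bits of padding, plus 384 bits of references, matching the figures in the quote.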
You might want to look at Java instrumentation to find that out. Here is an example of it.
In your case, as I believe you want to find the size of objects from within your application, you can make the Instrumentation object available globally (static) so that you can access it from your application.
Code copied from the link:
import java.lang.instrument.Instrumentation;

public class MyAgent {
private static volatile Instrumentation globalInstr;
public static void premain(String args, Instrumentation inst) {
globalInstr = inst;
}
public static long getObjectSize(Object obj) {
if (globalInstr == null)
throw new IllegalStateException("Agent not initted");
return globalInstr.getObjectSize(obj);
}
}
However, I believe you will only be able to find the size of objects (not primitive types; you also do not need to measure those, as you already know their sizes :-) )
Note that the getObjectSize() method does not include the memory used
by other objects referenced by the object passed in. For example, if
Object A has a reference to Object B, then Object A's reported memory
usage will include only the bytes needed for the reference to Object B
(usually 4 bytes), not the actual object.
To get a "deep" count of the memory usage of an object (i.e. which includes "subobjects" or objects referred to by the "main" object), then you can use the Classmexer agent available for beta download from this site.
That's not easy to do in Java: sizeof does not exist and alternate solutions, such as serializing the objects into a byte stream and looking at the resulting stream's length, don't work in all cases (e.g. strings).
However, see this quite complicated implementation using object graphs.
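For a rough idea of what an object-graph approach involves, here is a minimal sketch. It is not the linked implementation. It walks every object reachable from a root exactly once and sums a caller-supplied shallow size. The names DeepSizeSketch, deepSize and Node are hypothetical, and fields that reflection is not allowed to open (for example JDK internals on newer Java versions) are simply skipped, so the result can undercount.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.ToLongFunction;

class DeepSizeSketch {
    // Small linked type used only to demonstrate a cyclic graph.
    static class Node { Node next; int value; }

    // Sums shallowSize over every object reachable from root, visiting each object once.
    static long deepSize(Object root, ToLongFunction<Object> shallowSize) {
        // Identity-based set so that equal but distinct objects are counted separately.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> pending = new ArrayDeque<>();
        visit(root, seen, pending);
        long total = 0;
        while (!pending.isEmpty()) {
            Object current = pending.pop();
            total += shallowSize.applyAsLong(current);
            if (current instanceof Object[]) {
                // Arrays of references: follow each element.
                for (Object element : (Object[]) current) {
                    visit(element, seen, pending);
                }
                continue;
            }
            // Follow every non-static reference field declared in the class hierarchy.
            for (Class<?> c = current.getClass(); c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        f.setAccessible(true);
                        visit(f.get(current), seen, pending);
                    } catch (RuntimeException | IllegalAccessException e) {
                        // Field could not be opened; skip it rather than fail.
                    }
                }
            }
        }
        return total;
    }

    // Queues an object for processing the first time it is encountered.
    private static void visit(Object o, Set<Object> seen, Deque<Object> pending) {
        if (o != null && seen.add(o)) {
            pending.push(o);
        }
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.next = b;
        b.next = a; // cycle: each node must still be counted only once
        System.out.println(deepSize(a, o -> 1L)); // counts reachable objects
    }
}
```

Passing o -> 1L simply counts reachable objects. Inside an agent, one could instead pass the getObjectSize method of an Instrumentation instance as the shallow-size function.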