Steps in the memory allocation process for Java objects


What happens in the memory when a class instantiates the following object?
public class SomeObject{
private String strSomeProperty;
public SomeObject(String strSomeProperty){
this.strSomeProperty = strSomeProperty;
}
public void setSomeProperty(String strSomeProperty){
this.strSomeProperty = strSomeProperty;
}
public String getSomeProperty(){
return this.strSomeProperty;
}
}
In class SomeClass1:
SomeObject so1 = new SomeObject("some property value");
In class SomeClass2:
SomeObject so2 = new SomeObject("another property value");
How is memory allocated to the newly instantiated object and its properties?

Let's step through it:
SomeObject so1 = new SomeObject("some property value");
... is actually more complicated than it looks, because you're creating a new String. It might be easier to think of as:
String tmp = new String("some property value");
SomeObject so1 = new SomeObject(tmp);
// Not that you would normally write it in this way.
(To be absolutely accurate, these are not really equivalent. In the original, the String literal is part of the .class image and the JVM creates one shared String object for it, rather than a new one each time the line runs. You can think of this as a performance hack.)
So, first the JVM allocates space for the String. You typically don't know or care about the internals of the String implementation, so just take it on trust that a chunk of memory is being used to represent "some property value". Also, you have some memory temporarily allocated containing a reference to the String. In the second form, it's explicitly called tmp; in your original form Java handles it without naming it.
Next the JVM allocates space for a new SomeObject. That's a bit of space for Java's internal bookkeeping, and space for each of the object's fields. In this case, there's just one field, strSomeProperty.
Bear in mind that strSomeProperty is just a reference to a String. For now, it'll be initialised to null.
Next, the constructor is executed.
this.strSomeProperty = strSomeProperty;
All this does is copy the reference to the String, into your strSomeProperty field.
Finally, space is allocated for the object reference so1. This is set with a reference to the SomeObject.
so2 works in exactly the same way.
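The steps above can be checked with a small sketch. AllocationDemo and its method name are made up for illustration; SomeObject mirrors the class from the question (setter omitted). It only shows that the constructor stores the reference it is given and does not copy the String.

```java
// Hypothetical demo class; SomeObject mirrors the class from the question.
class AllocationDemo {
    static class SomeObject {
        private String strSomeProperty; // null until the constructor assigns it

        SomeObject(String strSomeProperty) {
            this.strSomeProperty = strSomeProperty; // copies the reference only
        }

        String getSomeProperty() {
            return this.strSomeProperty;
        }
    }

    // Returns true if the field refers to the very same String that was passed in.
    static boolean fieldSharesReference() {
        String tmp = new String("some property value"); // a fresh String on the heap
        SomeObject so1 = new SomeObject(tmp);           // a fresh SomeObject on the heap
        return so1.getSomeProperty() == tmp;            // identity check, not equals()
    }

    public static void main(String[] args) {
        System.out.println(fieldSharesReference()); // prints "true"
    }
}
```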

Determining Memory Usage in Java by Dr. Heinz M. Kabutz gives a precise answer, plus a program to calculate the memory usage. The relevant part:
The class takes up at least 8 bytes. So, if you say new Object(); you will allocate 8 bytes on the heap.
Each data member takes up 4 bytes, except for long and double which take up 8 bytes. Even if the data member is a byte, it will still take up 4 bytes! In addition, the amount of memory used is increased in 8 byte blocks. So, if you have a class that contains one byte it will take up 8 bytes for the class and 8 bytes for the data, totalling 16 bytes (groan!).
Arrays are a bit more clever. Primitives get packed in arrays, so if you have an array of bytes they will each take up one byte (wow!). The memory usage of course still goes up in 8 byte blocks.
As people have pointed out in the comments, Strings are a special case, because they can be interned. You can reason about the space they take up in the same way, but keep in mind that what looks like multiple copies of the same String may actually point to the same reference.
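To make the quoted arithmetic concrete, here is a rough sketch. ShallowSizeEstimate and its constants are assumptions taken straight from the figures quoted above (8-byte header, 4 bytes per small member, 8-byte blocks); a real JVM may lay objects out differently, so treat the numbers as an illustration of the rounding rule only.

```java
// Hypothetical helper that applies the rules quoted above. The constants are
// the article's figures for its JVM, not guarantees for every JVM.
class ShallowSizeEstimate {
    static final int HEADER_BYTES = 8; // "The class takes up at least 8 bytes"
    static final int GRANULARITY = 8;  // memory use grows in 8-byte blocks

    // smallFields: members counted as 4 bytes each; wideFields: long/double members.
    static int estimate(int smallFields, int wideFields) {
        int raw = HEADER_BYTES + 4 * smallFields + 8 * wideFields;
        return (raw + GRANULARITY - 1) / GRANULARITY * GRANULARITY; // round up to a block
    }

    public static void main(String[] args) {
        System.out.println(estimate(0, 0)); // new Object(): 8
        System.out.println(estimate(1, 0)); // one byte field: 16
        System.out.println(estimate(1, 1)); // one int and one long: 24
    }
}
```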

Points to remember:
When a method is called, a frame is created on the top of stack.
Once a method has completed execution, flow of control returns to the calling method and its corresponding stack frame is flushed.
Local variables are created in the stack.
Instance variables are created in the heap & are part of the object they belong to.
Reference variables are created in the stack.
Ref: http://www.javatutorialhub.com/java-stack-heap.html

Related

Java: Best Practice for Declaration and Initialization

I do consider myself quite experienced in Java but there are still some minor basic things that I'm not really sure of.
I always try to write maintainable, easily readable code and aim for the highest efficiency.
For example, I call the "new" operator only when it is really needed.
That is because I don't want to allocate memory unnecessarily.
But what about supporting Variables?
Lots of people tend to declare a String just to assign some long ass method call like this:
String helper = Class.method1().method2(param).getter();
I always wonder if that doesn't allocate more memory than needed.
The getter returns a new Object already and now even more memory is allocated by referencing it in addition.
When I have to use this getter more than once a helper string is convenient, but if it is needed only once, wouldn't it be better to pass that method call directly instead of declaring a new variable?
Does this actually allocate memory to the heap?
Object a, b, c, d, e ,f , g, h, i, j ...;
I hope some more experienced Java guys than me can tell me how they handle basic stuff like this. thanks! :)

Does this actually allocate memory to the heap?
No, the references are allocated on the stack, and declaring them creates no objects. Fields (Java has no true global variables) are stored on the heap.
When your method finishes, the stack space it used is released, and the variables will no longer be of any concern.
I always wonder if that doesn't allocate more memory than needed
It will allocate memory on the stack for a reference to the returned object. This reference is not needed if you pass the method call directly as an argument. However, the method still has to be evaluated and its return value will be placed on the stack. If you declare a variable for it, a few extra writes and reads have to be performed, which in most cases is of no concern on today's efficient computers.
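A small sketch of the point about helper variables. HelperVariableDemo and expensiveGetter() are made-up names standing in for the long call chain in the question; the sketch only shows that assigning the result to a helper creates a second reference, not a second object.

```java
// Hypothetical demo; expensiveGetter() stands in for a long call chain such as
// Class.method1().method2(param).getter() from the question.
class HelperVariableDemo {
    static StringBuilder expensiveGetter() {
        return new StringBuilder("value"); // the only allocation in this demo
    }

    static boolean helperIsJustAReference() {
        StringBuilder original = expensiveGetter(); // one object on the heap
        StringBuilder helper = original;            // no allocation: a second reference
        helper.append("!");                         // the change is visible through both names
        return helper == original && original.toString().equals("value!");
    }

    public static void main(String[] args) {
        System.out.println(helperIsJustAReference()); // prints "true"
    }
}
```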

If the member can only be set via an accessor (a "setter" method), I prefer the first style. It provides a hint that the initialized value is the default upon construction.
If the member can be specified during construction, I generally pass the default value to an appropriate constructor from the constructor with fewer parameters. For example,
final class Example {
private final String name;
Example() {
this("My Example");
}
Example(String name) {
this.name = name;
}
}

Why can't object size be measured in a managed environment?

So a number of variations of a question exist here on stackoverflow that ask how to measure the size of an object (for example this one). And the answers point to the fact, without much elaboration, that it is not possible. Can somebody explain at length why it is not possible, or why it does not make sense, to measure object sizes?

I guess from the tags that you are asking about measurements of object sizes in Java and C#. Don't know much about C# therefore the following only pertains to Java.
Also there is a difference between the shallow and retained size of a single object, and I suppose you are asking about the shallow size (which would be the base to derive the retained size).
I also take your term "managed environment" to mean that you only want to know the size of an object at runtime in a specific JVM (not, for instance, calculating the size by looking only at source code).
My short answers first:
Does it make sense to measure object sizes? Yes it does. Any developer of an application which runs under memory constraints is happy to know the memory implications of class layouts and object allocations.
Is it impossible to measure in managed environments? No it is not. The JVM must know about the size of its objects and therefore must be able to report the size of an object. If we only had a way to ask for it.
Long answer:
There are plenty of reasons why the object size cannot be derived from the class definition alone, for example:
The Java language spec defines the value ranges of primitive types, not their in-memory footprint. An int holds a 32-bit value, but how much memory it actually occupies is up to the VM.
Not sure what the language spec tells about the size of references. Is there any constraint on the number of possible objects in a JVM (which would have implications for the size of internal storage for object references)? Today's JVMs typically use 4 bytes for a reference (32-bit JVMs, or 64-bit HotSpot with compressed references) and 8 bytes otherwise.
JVMs may (and do) pad the object bytes to align at some boundary, which may extend the object size. Today's JVMs usually align object memory at an 8-byte boundary.
But all these reasons do not apply to a JVM runtime which uses actual memory layouts, eventually allows its generational garbage collector to push objects around, and must therefore be able to report object sizes.
So how do we know about object sizes at runtime?
In Java 1.5 we got java.lang.instrument.Instrumentation#getObjectSize(Object).
The Javadoc says:
Returns an implementation-specific approximation of the amount of
storage consumed by the specified object. The result may include some
or all of the object's overhead, and thus is useful for comparison
within an implementation but not between implementations. The estimate
may change during a single invocation of the JVM.
Reading with a grain of salt this tells me that there is a reasonable way to get the exact shallow size of an object during one point at runtime.
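For reference, a minimal sketch of how getObjectSize can be reached. ObjectSizeAgent and sizeOf are hypothetical names; the sketch assumes the class is packaged in a jar whose manifest names it as Premain-Class and that the JVM is started with -javaagent pointing at that jar. Without the agent, sizeOf simply refuses to answer.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent class. It is assumed to be packaged in a jar whose manifest
// names it as Premain-Class, with the JVM started using -javaagent:<that jar>.
class ObjectSizeAgent {
    private static volatile Instrumentation instrumentation;

    // Called by the JVM before main() when the agent is loaded.
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // Shallow size as reported by this JVM; referenced objects are not included.
    static long sizeOf(Object o) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException("Agent not loaded; start the JVM with -javaagent");
        }
        return inst.getObjectSize(o);
    }
}
```

With the agent loaded, ObjectSizeAgent.sizeOf(new Object()) returns whatever this particular JVM reports, which is exactly the implementation-specific figure the Javadoc describes.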

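Getting the size of an object is easily possible if its serialized size is good enough for you. Keep in mind that this is the size of the serialized form, not the memory the object occupies on the heap.
Getting the object size this way may have some overhead if the object is large, since we use IO streams to get the size.
If you have to get the size of larger objects very frequently, you have to be careful.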
Have a look at below code.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectData implements Serializable {
    private int id = 1;
    private String name = "sunrise76";
    private String city = "Newyork";
    private int[] dimensions = {20, 45, 789};
}

public class ObjectSize {
    public static void main(String[] args) {
        ObjectData data = new ObjectData();
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        // try-with-resources closes (and flushes) the stream before the bytes are counted
        try (ObjectOutputStream oos = new ObjectOutputStream(b)) {
            oos.writeObject(data);
        } catch (IOException err) {
            err.printStackTrace();
        }
        // Size of the serialized form, not of the object on the heap
        System.out.println("Size:" + b.size());
    }
}

Java String Immutability and Using same string value to create a new string

I know that the title of the question is not very clear, sorry about that, did not know how to put it up. I have a very basic java implementation question which I want to focus on application performance, but it also involves String creation pattern in java.
I understand the immutability concept of Strings in Java. What I am not sure about is that, I have read somewhere that the following will not make two different String objects:
String name = "Sambhav";
String myName= "Sambhav";
I want to know how does Java do that? Does it actually look for a String value in the program memory and check for its existence and if it does not exist then creates a new String object?
In that case obviously it is saving memory but there are performance issues.
Also lets say I have a code like this:
public void some_method(){
String name = "Sambhav";
System.out.println(name); // or any random stuff
}
Now on each call of this function, is there a new String being made and added to memory or am I using the same String object?
I am just curious to know about the insights of how all this is happening?
Also if we say that
String name = "Sambhav";
String myName= "Sambhav";
will not create a new object because of reference, what about
String name = new String("Sambhav");
String myName= new String("Sambhav");
Will Java still be able to catch that the strings are the same and just point myName to the same object as created in the previous statement?

Strings are internally char arrays with some inherent capabilities to work with the underlying char array, e.g. the substring(int) and split(String) methods.
Strings are immutable, which means any attempt to change the value behind a String reference creates a new String and allocates memory for it, as below.
line 1. String a = new String("SomeString");
line 2. a = "SomeStringChanged";
line 1 allocates memory for a new String containing "SomeString", referenced by variable a; the literal "SomeString" itself is added to the String Pool.
line 2 makes a refer to the literal "SomeStringChanged" in the String Pool, i.e. a no longer points to the String created in line 1, and the memory occupied by that String is now available for gc.
No reuse here
line 3. String b = "SomeStringChanged";
Now the literal "SomeStringChanged" is reused by variable a and b. i.e they are referring to the same memory location, in fact to a location called the 'String Pool'.
line 4. a = new String("SomeStringChanged");
Now a new allocation is done which contains "SomeStringChanged" and referenced by a
There is no reuse happening now. (the char array SomeStringChanged is already there in String Pool. So no String Pool allocation happen)
line 5. a = new String("SomeStringChanged").intern();
Now the allocation created during line 4 is discarded and variable a and b are referring to same location in the String Pool which contains "SomeStringChanged". There is reuse of the same char array here. The credit goes to intern() method
line 6. String x = new String("SomeX");
line 7. String y = "SomeX";
Line 6 will create an allocation for SomeX in the heap and in the String Pool. The char array is duplicated.
Line 7 will not allocate any memory for SomeX since it is already there in the String Pool.
line 8. String s = new String(someStringVariable);
Line 8 will only allocate a single memory location in the heap and none in the String Pool.
In conclusion, the reuse of a String's char array is only possible if the String is declared as a literal or the String object is interned, i.e. only these two can make use of the String Pool (which is in fact the idea behind char array reuse).

Strings that you put in quotes in your source files "like that" are compile-time constants, and in case their contents match they are represented by a single entry in a constant pool inside your class's byte-code representation and thus represent a single String object at run-time.
String name = new String("Sambhav");
String myName= new String("Sambhav");
Those are explicitly different objects: a new String object will be created for each call, though it could reuse the char array of the underlying string (the one you provide in the constructor). This happens because the new keyword instructs Java to create a new object. And that is why name != myName in that case, even though name.equals(myName).
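The three cases can be checked directly with reference comparison. StringIdentityDemo and its method names are made up for this sketch; the behaviour it relies on (identical literals share one object, new always creates a distinct object, intern() returns the pooled instance) is what the language guarantees.

```java
// Hypothetical demo class comparing references with == on purpose.
class StringIdentityDemo {
    // Two identical literals refer to one shared String object.
    static boolean literalsShareOneObject() {
        String name = "Sambhav";
        String myName = "Sambhav";
        return name == myName;
    }

    // new always creates a distinct object, even for equal contents.
    static boolean newCreatesDistinctObjects() {
        String name = new String("Sambhav");
        String myName = new String("Sambhav");
        return name != myName && name.equals(myName);
    }

    // intern() maps an equal String back to the pooled literal.
    static boolean internReturnsPooledObject() {
        return new String("Sambhav").intern() == "Sambhav";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // prints "true"
        System.out.println(newCreatesDistinctObjects()); // prints "true"
        System.out.println(internReturnsPooledObject()); // prints "true"
    }
}
```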

String name = new String("Sambhav");
String myName = new String("Sambhav");
Will Java still be able to catch that the string
are the same and just point myName to the same object as created in
the previous statement?
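Not for new String(...): each call creates a distinct object. For string literals and for strings passed through intern(), the JVM keeps a single shared instance per distinct value, which it looks up by computing a hash.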
Those String objects are kept in a String pool.
String pooling
String pooling (sometimes also called as string canonicalisation) is a process of replacing several String objects with equal value but different identity with a single shared String object.
You can achieve this goal by keeping your own Map<String, String> (with possibly soft or weak references depending on your requirements) and using map values as canonicalised values.
Or you can use String.intern() method which is provided to you by JDK.
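A minimal sketch of the hand-rolled variant, assuming a plain HashMap with strong references. StringCanonicalizer and canonical are hypothetical names; the sketch is not thread-safe and never releases entries, so a real implementation would need the soft or weak references mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical canonicaliser backed by a plain HashMap (strong references,
// not thread-safe). It maps every equal String to the first instance seen.
class StringCanonicalizer {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the pooled instance equal to s, adding s to the pool if it is new.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        StringCanonicalizer canonicalizer = new StringCanonicalizer();
        String first = new String("Sambhav");
        String second = new String("Sambhav"); // equal value, different identity
        System.out.println(canonicalizer.canonical(first) == first);  // prints "true"
        System.out.println(canonicalizer.canonical(second) == first); // prints "true"
    }
}
```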
Quick string pool differences by JVM version
In Java 6, this String pool was located in the PermGen memory. This memory is usually small and limited. Also, String.intern() shouldn't be used here because you can run out of memory.
In Java 7 and 8 it was taken out to the heap memory and implemented with a hash-table like data structure.
Since hash-table like structures (HashMap, WeakHashMap) use a computed hash to access the entry in constant complexity, the entire process is very fast.
As mentioned in this article:
Stay away from String.intern() method on Java 6 due to a fixed size memory area (PermGen) used for JVM string pool storage.
Java 7 and 8 implement the string pool in the heap memory. It means that you are limited by the whole application memory for string pooling in Java 7 and 8.
Use -XX:StringTableSize JVM parameter in Java 7 and 8 to set the string pool map size. It is fixed, because it is implemented as a hash map with lists in the buckets. Approximate the number of distinct strings in your application (which you intend to intern) and set the pool size equal to some prime number close to this value. It will allow String.intern() to run in the constant time and requires a rather small memory consumption per interned string (explicitly used Java WeakHashMap will consume 4-5 times more memory for the same task).
The default value of -XX:StringTableSize parameter is 1009 in Java 7 and around 25-50K in Java 8.

You are actually showing 3 different reasons why the Strings may use the same buffer internally. Note that sharing a buffer is only possible for separate instances because they are immutable; otherwise changes in the buffer would be reflected in the other variable values as well.
Compiler detects identical String literals; if the string literal is repeated the compiler may simply point to the same object instance;
References to a String are pointing to the same object instance and are therefore identical by definition;
Buffer sharing may help during construction with new. If the runtime system sees that String contents may be shared then it may opt to do so; this behavior is however not guaranteed - it's implementation specific. The object instances should be different (but using them as separate instances would still not be wise).
As an example for #3, the Java 6 OpenJDK source will simply point to the same buffer. If the buffer is larger than the new String instance, a copy will be created so that the Garbage Collector can clear the larger string (otherwise the larger character buffer may be kept in memory indefinitely).
This all should not matter too much to you, unless you get careless and start using == for equality (or other constructs that confuse == with equals).

Integer and Integer Array storage on stack/heap

I'm curious to know how Integer and Integer Array are stored on the stack/heap in java, is there a link someone could point me to? Or could someone explain it to me please.
Update 1:
and how does this affect the way an integer and an integer array are passed as arguments to methods in Java.
Thank You

Whenever you declare a variable within a local scope(a method) it gets put on the stack.
That is: Type myVariable will push space for a new variable onto that method's stack frame, but it's not usable yet as it's uninitialized.
When you assign a value to the variable, that value is put into the reserved space on the stack.
Now here's the tricky part. If the type is primitive, the value contains the value you assigned. For example, int a = 55 will literally put the value 55 into that space.
However, if the type is non primitive, that is some subclass of Object, then the value put onto the stack is actually a memory address. This memory address points to a place on the heap, which is where the actual Object is stored.
The object is put into the heap on creation.
An Example
private void myMethod()
{
Object myObject = new Object();
}
We're declaring a variable, so we get space on the stack frame. The type is an Object, so this value is going to be a pointer to the space on the heap that was allocated when the Object was created.
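This is also what decides how arguments behave when passed to methods, as asked in the update. A small sketch (PassingDemo and its method names are made up): the callee always receives a copy of the stack value, which for an int is the number itself and for an array is the reference to the heap object.

```java
// Hypothetical demo: Java copies the stack value into the callee's frame.
class PassingDemo {
    static void increment(int value) {
        value++; // changes only the callee's copy of the number
    }

    static void incrementFirst(int[] values) {
        values[0]++; // follows the copied reference to the shared array on the heap
    }

    static int afterPrimitiveCall() {
        int a = 55;
        increment(a);
        return a; // still 55
    }

    static int afterArrayCall() {
        int[] arr = {55};
        incrementFirst(arr);
        return arr[0]; // now 56
    }

    public static void main(String[] args) {
        System.out.println(afterPrimitiveCall()); // prints "55"
        System.out.println(afterArrayCall());     // prints "56"
    }
}
```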

Variables contain only references to these objects. In the case of local variables these references are stored on the stack, but the data of the objects they point to is stored in the heap.
You can read more, for example, here: link

Method variables are stored on the stack. Objects, on the other hand, are stored in the heap.
That's why a StackOverflowError usually means you are nesting too many method calls, typically in a recursive call. And if you get an OutOfMemoryError: Java heap space, that means you are creating more objects than the heap can hold.
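To see the stack limit in action, here is a rough sketch that recurses until the JVM throws java.lang.StackOverflowError and reports how many frames it managed. StackDepthDemo and its method names are made up; the depth reached depends on the JVM, its stack size settings, and the thread, so no particular number should be expected.

```java
// Hypothetical demo: every call to recurse() pushes one more frame on the stack.
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // no base case on purpose
    }

    // Returns how many frames were pushed before the stack ran out.
    static int framesUntilOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack is exhausted; the frames are unwound when we land here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before overflow: " + framesUntilOverflow());
    }
}
```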
For Stack and Heap explanation, I recommend this link

what is the size of empty class in C++,java?

What is the size of an empty class in C++ and Java?
Why is it not zero?
sizeof(); returns 1 in the case of C++.

Short Answer for C++:
The C++ standard explicitly says that a class can not have zero size.
Long Answer for C++:
Because each object needs to have a unique address (also defined in the standard) you can't really have zero sized objects.
Imagine an array of zero sized objects. Because they have zero size they would all line up on the same address location. So it is easier to say that objects can not have zero size.
Note:
Even though an object has a non zero size, if it actually takes up zero room it does not need to increase the size of derived class:
Example:
#include <iostream>
class A {};
class B {};
class C: public A, B {};
int main()
{
std::cout << sizeof(A) << "\n";
std::cout << sizeof(B) << "\n";
std::cout << sizeof(C) << "\n"; // Result is not 3 as intuitively expected.
}
g++ ty.cpp
./a.out
1
1
1

In the Java case:
There is no simple way to find out how much memory an object occupies in Java; i.e. there is no sizeof operator.
There are a few ways (e.g. using Instrumentation or 3rd party libraries) that will give you a number, but the meaning is nuanced [1]; see In Java, what is the best way to determine the size of an object?
The size of an object (empty or non-empty) is platform specific.
The size of an instance of an "empty class" (i.e. java.lang.Object) is not zero because the instance has implicit state associated with it. For instance, state is needed:
so that the object can function as a primitive lock,
to represent its identity hashcode,
to indicate if the object has been finalized,
to refer to the object's runtime class,
to hold the object's GC mark bits,
and so on.
Current Hotspot JVMs use clever tricks to represent the state in an object header that occupies two 32 bit words on a 32-bit JVM (64-bit JVMs typically use 12 or 16 bytes, depending on whether compressed class pointers are enabled). (This expands in some circumstances; e.g. when a primitive lock is actually used, or after identityHashCode() is called.)
[1] For example, does the size of the string object created by new String("hello") include the size of that backing array that holds the characters? From the JVM perspective, that array is a separate object!

Because every C++ object needs to have a separate address, it isn't possible to have a class with zero size (other than some special cases related to base classes). There is more information in C++: What is the size of an object of an empty class? .

Because an object has to have an address in memory, and to have an address in memory, it has to occupy "some" memory. So, it is usually, in C++, the smallest possible amount, i.e. 1 char (but that might depend on the compiler). In Java, I wouldn't be so sure.. it might have some default data (more than just a placeholder like in C++), but it would be surprising if it was much more than in C++.

C++ requires that a normal instantiation of it have a size of at least 1 (could be larger, though I don't know of a compiler that does that). It allows, however, an "empty base class optimization", so even though the class has a minimum size of 1, when it's used as a base class it does not have to add anything to the size of the derived class.
I'd guess Java probably does pretty much the same. The reason C++ requires a size of at least 1 is that it requires each object to be unique. Consider, for example, an array of objects with size zero. All the objects would be at the same address, so you'd really only have one object. Allowing it to be zero sounds like a recipe for problems...

It's defined by the C++ standard as "a nonzero value", because an allocated object must have a nonzero size in order to have a distinct address. A class that inherits from an empty class, however, is not required to increase in size, barring the usual increase of a vtable if there are virtual functions involved.

I don't know if there is a sizeof() operator in java. What you can do is create an instance of the empty class (have it serializable), send it through a PipedOutputStream and read it as byte array - byteArray.length gives you the size.
Alternatively, write out the instance to a file using DataOutputStream, close the File, open it and file.length() will give you the size of the Object. Hope this helps, - M.S.

As others have pointed out, C++ objects cannot have zero size. Classes can contribute zero size only when they act as a base class of a different class. Take a look at @Martin York's answer for a description with examples, and also look at and vote for the other answers that are correct in this respect.
In Java, in the HotSpot VM, there is a memory overhead of 2 machine words (usually 4 bytes per word on a 32-bit architecture) per object to hold bookkeeping information together with runtime type information. For arrays a third word is required to hold the size. Other implementations can take a different amount of memory (the classic Java VM, according to the same reference, took 3 words per object).
