management of the objects - java

I have a question about how objects are managed in Java compared with C++.
The case is, in c++, when you want to create a dynamic object, one that survives beyond the block scope in which it is created, you have to do a new and you will receive a pointer. Otherwise, if you just want to use this object in the block scope, you don't need to create it using new...
But in Java, you always have to create them using new, because if not, the object is null and you can't use it.
Why is that? Is it just how it works?
Thanks

The best analogy I can think of is that all types in C++ behave somewhat like primitives in Java. If you declare a primitive in Java, you don't have to use new; you can just use the variable right away. But such a primitive, much like most objects in C++, will only survive the current scope. In C++, if you want an object to exist outside of the current scope, you need to tell your compiler, because it will have to allocate memory on the heap instead of the stack. You can do this by using new. In Java, all objects (that is, everything except primitives) are allocated on the heap; the only data on the stack are references to heap memory, and primitives. Therefore, in Java, all object allocations are done using new.
The above is a simplification of the actual memory management in Java. For a more thorough discussion on stack/heap memory regarding primitives, take a look here.

This difference is because Java uses a garbage collector for memory management. Since the garbage collector automatically deallocates objects once they are no longer reachable (typically some time after the last reference to them goes out of scope), there is no need to have two different methods for creating objects.
You can say that objects in Java automatically behave like C++ objects that are initialized without new, in that you don't have to think about deleting them.

Basically, that's just how it works. Once the new keyword is used, the object is created and placed on the heap. If you do not keep a reference to the object outside of the method, it will be automatically reclaimed by the garbage collector.
I suggest that you do some reading around the basics of the Java heap and garbage collection to get a better understanding. There are plenty of resources out there. I always recommend the Head First books for newcomers.

In C++, anything can be allocated on the stack, which is what happens when you say
ObjectType o;
in C++.
In Java, only primitives are really allocated on the stack. Objects are never on the stack (It's just how it is). When you say
ObjectType o;
in Java, no object is allocated, only a "variable". A variable can have a reference to an object, but at the moment it has none. In essence, it's the same thing as saying
ObjectType *o = NULL;
in C++.
In order to actually allocate an object for this reference to refer to, you have to use new in Java.
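To make the declaration/allocation distinction concrete, here is a minimal Java sketch. The class and member names (DeclarationVsAllocation, unassigned, sizeAfterAllocation) are made up for illustration. One detail worth noting: a field that is never assigned defaults to null, while a local variable that is never assigned cannot be read at all, because the compiler rejects the code.

```java
import java.util.ArrayList;
import java.util.List;

class DeclarationVsAllocation {
    // Declared but never assigned: a field defaults to null, no object exists.
    static List<String> unassigned;

    static int sizeAfterAllocation() {
        List<String> names;          // only a reference variable; no object yet
        // names.size();             // would not compile: names is not assigned
        names = new ArrayList<>();   // new allocates the object, returns a reference
        names.add("first");
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println(unassigned == null);      // true
        System.out.println(sizeAfterAllocation());   // 1
    }
}
```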

The case is, in c++, when you want to create a dynamic object, one that survives beyond the block scope in which it is created, you have to do a new and you will receive a pointer.
The new operator in C++ allocates space on the heap. The heap is where the larger part of the main memory is. If you do this, you are responsible for freeing that space when you're done with it, using the delete operator.
Otherwise, if you just want to use this object in the block scope, you don't need to create it using new...
When you declare variables in C++, memory is allocated on the stack. The stack is where local data is stored and everything you push(add) on it while executing a function will be automatically popped (removed) when the function returns. The stack is usually a lot smaller than the heap, but there are advantages to using it: you don't need to worry about memory management, it is faster, etc.
But in Java, you always have to create them using new, because if not, the object is null and you can't use it.
When you declare variables in Java, they are again stored on the stack.
As you know, you don't call new on primitive data types (you can't write int i = new int(3);). When you write something like Object x; you declare that x is a reference to an object of type Object. However, you have not assigned a value to it, so the reference is null (not the object, because there isn't one).
The new operator in Java, roughly speaking, allocates space on the heap, calls the constructor of the class being instantiated, and returns a reference to the constructed object. The difference from C++ is that you don't need to free the object yourself: there is a garbage collector. In essence, it keeps track of whether an object can still be reached through any reference, and once no reference to it remains it reclaims the object automatically.
So when you do
Object y = new Object();
x = y;
you will get two references (x and y) pointing to the same object. When you have a function call like this
Object foo() {
    Object y = new Object();
    return y;
}
void bar() {
    Object x = foo();
    ...
}
in the ... part of bar() you would have the reference x, pointing to the object created in foo(). Since foo has returned, the y reference no longer exists, so there would be only one reference to this object in the ... part of the program. If you don't copy the x reference anywhere in bar and bar returns, then there would be 0 references to the object, and the garbage collector would collect it (although not immediately).
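A small sketch of the "two references, one object" point may help. It uses StringBuilder instead of Object only so that a change made through one reference is visible through the other. The names SharedReference and sameObject are hypothetical.

```java
class SharedReference {
    static StringBuilder foo() {
        StringBuilder y = new StringBuilder("created in foo");
        return y;   // hands back a copy of the reference, not a copy of the object
    }

    static boolean sameObject() {
        StringBuilder x = foo();   // x refers to the object created in foo()
        StringBuilder z = x;       // second reference to the very same object
        z.append("!");             // the change is visible through x as well
        return x == z && x.toString().equals("created in foo!");
    }

    public static void main(String[] args) {
        System.out.println(sameObject());   // true
    }
}
```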
-Stan

Related

declaring objects in C++ vs java

I have been using c++ for a while now and I am learning java,
declaring objects in java is confusing me,
In java we write
myclass myobject = new myclass();
myobject.mymethod();
Is it same as this code in c++ ?
myclass *myobject = new myclass();
myobject->mymethod();
i.e is the memory getting allocated on heap? If it is on heap why we never free the memory. I believe the new keyword is the same.
If so, how do we allocate memory on stack?
Is it same as this code in c++ ?
Yes. It's the same.
i.e is the memory getting allocated on heap?
Yes, it is.
If it is on heap why we never free the memory.
The object becomes eligible for garbage collection when it is no longer reachable, i.e. when there are no valid references to it left.
If so, how do we allocate memory on stack?
When a thread starts executing, the local variables of the methods it runs are placed on its stack, and they are removed as soon as the corresponding method (and ultimately the thread's job) finishes. Every thread has its own stack.
As you thought, the new operator allocates a new object on the heap. Memory in Java is not freed explicitly: once an object can no longer be reached from any root, it is eligible for being freed. Periodically, a garbage collection thread will free this memory.
Whilst it is not inaccurate to say that this C++ code is equivalent:
myclass* myobject = new myclass();
myobject->mymethod();
It is also not exactly the same.
Java has a Garbage Collector and so, as you note, you do not have to free the object in Java.
So a closer approximation to the original Java might be this:
std::shared_ptr<myclass> myobject = std::make_shared<myclass>();
myobject->mymethod();
Now you do not have to deallocate myobject; it is destroyed automatically when there are no longer any references to it.
However, it would be a mistake in C++ to use std::shared_ptr for every heap-allocated object, because the reference counting adds overhead that can noticeably hurt performance.
As a rule, it is better to manage a heap-allocated object in one place using a std::unique_ptr. If it is impossible to know which component will be the last to release the object, a std::shared_ptr should be used in each place.
However when calling down to functions from the component that holds the smart pointer you should pass the raw pointer or a reference:
std::shared_ptr<myclass> myobject = std::make_shared<myclass>();
myobject->mymethod();
ptr_needing_func(myobject.get()); // pass raw pointer using get()
ref_needing_func(*myobject.get()); // pass reference using *get()
This way you don't lose any efficiency while still maintaining the safety and convenience of garbage collecting smart pointers.
See: CppCoreGuidelines: R.23
After reading other answers to this question and some other articles, I understood that,
Both the C++ and the Java code do very similar things; only the syntax differs, and Java uses references instead of pointers (Java doesn't have pointers).
Here,
myclass myobject; is a declaration of myobject,
Declarations simply notify the compiler that we will be using myobject to refer to a variable whose type is myclass. It is not allocating the memory.
new myclass(); is instantiating the object (allocating the memory in the heap) and returning the reference to it.
It is also initializing the object by calling the constructor myclass().
Clarification of a very basic doubt,
int i; ==> Declares the variable and allocates memory for it on the stack.
myclass myobject; ==> Only declares the reference variable for the object (it also takes 4 or 8 bytes, depending on the system). It does not allocate the memory where the instance variables will be stored.
In other words, memory is allocated at declaration for primitive data types but not for non-primitive data types. For non-primitive data types we need to allocate it using the new keyword.
Why we never free the memory?
Java has garbage collector that does it for us automatically.
How do we allocate memory for objects in stack?
We can't. Only primitive values and references to objects can be stored on the stack.

Stack vs Heap in C/Java

Here's my understanding.
In C programming, if I do int a then that a is created on stack and thus the memory is taken from stack. Heap plays no part here.
But if I do something like
int *a;
a=(int*)malloc(sizeof(int));
and dynamically allocate the memory, then the pointer variable will be placed on stack, but the memory it points to will be on the heap.
Am I correct with my understanding?
Now, I picked up this book on java that says
Whenever you need an object, you
simply write the code to create it by using new, and the storage is allocated on the
heap when that code is executed.
So there's no way of creating objects on Stack in Java?
I guess, the primitive data types can still be placed on stack, but I am concerned about the Objects.
There is no way to create objects on the stack in Java. Java also has automatic garbage collection, so you don't have any way of deleting objects. You just let all references to them go out of scope and eventually the garbage collector deals with them.
That is correct. Objects are stored on the heap. The stack contains primitive values like int and double (from local variables) and references to objects (again from local variables).
The whole premise of your question is false: in Java you don't have any control over where the objects will be allocated. Some are indeed stack-allocated, but you'll never notice the difference.
What is fundamentally different between Java and C is that in Java the value of a variable can never be the object itself, whereas in C the value can be the struct itself, with no indirection. You can pass such structs by value to other functions and there is no equivalent of that in Java.
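To illustrate that a Java variable only ever holds a reference, here is a short sketch. Point, mutate and reassign are invented names. Passing an object to a method copies the reference, so changes to the object are visible to the caller, while reassigning the parameter is not.

```java
class ReferenceCopy {
    static final class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    // Changes the object the caller's reference also points to.
    static void mutate(Point p) { p.x = 42; }

    // Only rebinds the local copy of the reference; the caller is unaffected.
    static void reassign(Point p) { p = new Point(-1); }

    static int demo() {
        Point original = new Point(1);
        mutate(original);
        reassign(original);
        return original.x;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // 42
    }
}
```

In C, passing a struct by value would instead give the callee its own copy of the whole struct, which is the behavior Java has no equivalent for.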

Why are reference types stored in heap

I do know that in Java (perhaps in .NET too), primitives are stored on the stack, whereas reference types are stored on the heap.
My question is that I do not understand the pros/cons of this behavior. Why can't we reference a memory location inside our stack instead? I couldn't find a proper explanation when I googled (maybe I suck at it), but if you can provide some insights I would be grateful.
Thanks.
I do know that in Java (perhaps in .NET too), primitives are stored on the stack, whereas reference types are stored on the heap.
No. It does not depend on whether it's a primitive or a reference. Whether the stack or the heap is used depends on the scope. Local variables are allocated on the stack; member variables are allocated on the heap when the object is instantiated.
See also Do Java primitives go on the Stack or the Heap?
My question is that I do not understand the pros/cons of this behavior.
Data stored on the stack only lives as long as your method is executing. Once the method is done, all data allocated on the stack is removed.
Data stored on the heap lives as long as it is not discarded (which, in case of Java, is done in the background by the garbage collector). In other languages as C/C++, you explicitly need to delete/free data which was allocated on the heap.
Consider the following code snippet:
String someMethod() {
    int i = 0;
    String result = "Hello";
    i = i + 5;
    return result;
}
Here, a primitive (int i) is created on the stack and some calculation is done on it. Once the method finishes, i cannot be accessed anymore, and its value is lost. The same is basically true for the result reference: the reference is allocated on the stack, but the Object (a String object in this case) is allocated on the Heap. By returning the reference as return value, the object it references can still be used outside the method.
You can't generally store reference types on stack because the stack frame is destroyed upon method return. If you saved a reference to an object so it can be dereferenced after the method completes, you'd be dereferencing a non-existent stack location.
The HotSpot JVM can perform escape analysis and, if it determines that an object cannot possibly escape the method scope, it will in fact allocate it on the stack.
where as reference types are stored on heaps.
I don't know what exactly you mean by that part, but remember that, only objects are stored on heap, whereas, references pointing to those objects are still on the stack. Probably this was the doubt you had.
Now, you should also note that, only local variables are stored on stack, whereas instance / member variables are stored on Heap.
For e.g.: -
String str = new String("Rohit"); // Local variable
In above case, str reference will be allocated memory on stack, if of course it is defined in some local scope. And it will point to a new string object created on Heap.
Why can't we reference a memory location inside our stacks instead?
You can but think of this decision as Memory Architecture decision.
Conceptually, data on a stack can't be retrieved unless it is on top. But in the real world you need some locations to be accessible from anywhere in the program, so they can't live on the stack; that region was named the heap.
This link may throw more light on it.

is there a way to recycle a complex java object once the GC has decided it is unreachable

In C++ I use reference-counted objects to implement a form of "auto" recycling object pool
SmartPointer<ObjType> object = pool.getObject(); // hold reference
// ... do stuff with object over time.
object = nullptr; // that is when reference
// count goes to 0
Now, on the C++ objects I have an "onFinalRelease()" method which gets called when the refcount reaches 0. I can override this (the default is delete(this)) to auto-recycle objects rather than destroying them.
The question is can I implement this pattern with some combination of java reference types and reference pools. Of course this is for a type of large complex expensive to create object where it makes sense. That is I want to do:
SomeReference r = referenceQueue.getReference();
pool.recycle(r.takeBackUnusedObjectFromGC()); // ??????????????????????????
This would be real nice :)
You can use PhantomReferences to do something like this. Have an interface (proxy) object with a (strong, unidirectional) reference to the expensive object. Also keep a strong reference to the expensive object in your pool management. Keep a PhantomReference to the interface object. Once the PhantomReference comes up on its ReferenceQueue you know for sure that the expensive object is not being used through an interface object (even allowing for finalisation). The expensive object can now be reused with a new interface object.
However, it probably isn't worth it.
With reference counting, there is a clearly defined time when an object becomes garbage - when the reference count goes to zero. With Java's garbage collection, there is no guarantee that a given object will ever be garbage collected, even if there are no more strong references to it.
Implementing your own reference counter by hand is the best solution I can think of.
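A hand-rolled counter could look roughly like the sketch below. Everything here (RefCounted, retain, release, and the onFinalRelease callback borrowed from the question's naming) is hypothetical and not a standard Java API. It also relies on callers pairing every retain() with a release(), which Java will not enforce for you.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class RefCounted<T> {
    private final T value;
    private final Consumer<T> onFinalRelease;
    private int count = 1;   // the creator holds the first reference

    RefCounted(T value, Consumer<T> onFinalRelease) {
        this.value = value;
        this.onFinalRelease = onFinalRelease;
    }

    // Registers one more user of the wrapped object.
    synchronized T retain() {
        if (count == 0) throw new IllegalStateException("already released");
        count++;
        return value;
    }

    // Drops one user; the callback fires exactly when the count reaches zero.
    synchronized void release() {
        if (count == 0) throw new IllegalStateException("released too often");
        if (--count == 0) onFinalRelease.accept(value);
    }

    public static void main(String[] args) {
        List<StringBuilder> recycled = new ArrayList<>();
        RefCounted<StringBuilder> ref =
                new RefCounted<>(new StringBuilder("expensive"), recycled::add);
        ref.retain();
        ref.release();
        System.out.println(recycled.size());   // 0, one user still holds it
        ref.release();
        System.out.println(recycled.size());   // 1, handed back for recycling
    }
}
```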
Java has something similar called the finalize method. Unfortunately, once it runs for an object, there's no going back. In addition, it's not even guaranteed to run.
Your best bet might be to create a pool of objects and track yourself whether they can be reused or not. Apache Commons Pool might be useful for this.
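For the "track it yourself" approach, a very small pool might be sketched as follows. SimplePool, acquire, release and createdCount are illustrative names only; this is not the Apache Commons Pool API, and the sketch is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuses an idle object if one exists, otherwise builds a new one.
    T acquire() {
        T obj = idle.poll();
        if (obj == null) {
            obj = factory.get();
            created++;
        }
        return obj;
    }

    // The caller promises not to touch obj after handing it back.
    void release(T obj) {
        idle.push(obj);
    }

    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new);
        StringBuilder a = pool.acquire();
        pool.release(a);
        StringBuilder b = pool.acquire();
        System.out.println(a == b);               // true, the object was reused
        System.out.println(pool.createdCount());  // 1
    }
}
```

The trade-off compared with the GC-driven idea in the question is that release() must be called explicitly, much like returning a connection to a connection pool.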
This class may be what you're looking for:
https://commons.apache.org/proper/commons-pool/apidocs/org/apache/commons/pool2/impl/SoftReferenceObjectPool.html

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?
That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance
From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)
The Java Language Specification
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it can allow the memory to be reclaimed sooner. One case is creating a large array to replace an existing array referenced by a field in a class: if the field is nulled before the replacement is created, then it may relieve pressure on the memory.
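The array-replacement case could look something like this sketch. GrowableBuffer and its methods are invented for illustration, and the approach only applies when the old contents are no longer needed and nothing else still references the old array.

```java
class GrowableBuffer {
    private byte[] data = new byte[1024];

    // Drops the old array before allocating the new one, so both never
    // have to be reachable at the same time. The old contents are discarded.
    void replaceWith(int newSize) {
        data = null;                 // old array is now unreachable via this field
        data = new byte[newSize];    // the allocation may trigger a GC that reclaims it
    }

    int capacity() {
        return data.length;
    }

    public static void main(String[] args) {
        GrowableBuffer buffer = new GrowableBuffer();
        buffer.replaceWith(4096);
        System.out.println(buffer.capacity());   // 4096
    }
}
```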
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
    int x = 1;
    {
        Object a = new Object();
    }
    System.out.println(x+1);
}
static void withoutBlock () {
    int x = 1;
    Object a = new Object();
    System.out.println(x+1);
}
Not necessarily. An object becomes eligible for garbage collection when it can no longer be reached through any reference held by a live thread.
Local variables go out of scope when the method returns, and it makes no sense at all to set local variables to null: the variables disappear anyway, and if nothing else holds a reference to the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.
It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)
That could only make some sense in some scenario like this:
public void myHeavyMethod() {
    List hugeList = loadHugeListOfStuff(); // lots of memory used
    ResultX res = processHugeList(hugeList); // compute some result or summary
    // hugeList = null; // we are done with hugeList
    ...
    // do a lot of other things that takes a LOT of time (seconds?)
    // and which do not require hugeList
    ...
}
Here it could be of some benefit to uncomment the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring it into two,
or specifying an inner scope).
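The "refactor into two" idea might be sketched like this. The list contents and the names ScopedHeavyWork and loadAndSummarize are made up; the point is only that the big list is a local of the helper method, so no reference to it survives into the long-running part.

```java
import java.util.ArrayList;
import java.util.List;

class ScopedHeavyWork {
    // Stand-in for the expensive load from the original snippet.
    static List<Integer> loadHugeListOfStuff() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        return list;
    }

    // hugeList lives only inside this method; after it returns,
    // nothing refers to the list any more.
    static long loadAndSummarize() {
        List<Integer> hugeList = loadHugeListOfStuff();
        long sum = 0;
        for (int value : hugeList) {
            sum += value;
        }
        return sum;
    }

    static long myHeavyMethod() {
        long res = loadAndSummarize();
        // ... long-running work that needs res but not the list ...
        return res;
    }

    public static void main(String[] args) {
        System.out.println(myHeavyMethod());   // 500500
    }
}
```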
Setting an object reference to null only makes the object eligible for garbage collection (provided nothing else references it).
It does not necessarily free up the memory, which depends on when the garbage collector runs (which in turn depends on the JVM).
When the garbage collector runs, it frees up the heap by deleting only the objects which are eligible for garbage collection.
It is nice to have. When you set references to null, there is a possibility that the object can be garbage collected sooner, in the next GC cycle. But there is no guaranteed mechanism to make an object garbage collected at a given time.
