l-value and r-value, stack and heap


I am a high school student learning Java (in BlueJ environment).
Context
My book, while discussing the pass-by-value and pass-by-reference mechanisms, uses the terms stack and heap. It also states that each unit in memory (also known as a variable) has a name, an l-value and an r-value, where l stands for 'locator' or 'location' and r stands for 'read'. The name identifies the unit, the l-value is the address of the unit, and the r-value is the value actually stored in the unit. For primitive datatypes the r-value is the actual value, while for reference datatypes it is the address of the object that the variable refers (points) to. When a function with parameters is called, the r-values of the actual parameters are copied into the r-values of the formal parameters. For primitive datatypes the actual value is copied, while for reference datatypes the reference address is copied. As a result, in the former case changes made inside the function do not affect the caller's values, while in the latter case they do.
My Questions
I decided to learn more about this on the Internet, and found that the discussions there are not consistent with my book. There, l-value and r-value are said to be the values on the left-hand side and the right-hand side, respectively, of the assignment sign. I am confused.
What is the actual meaning of l-value and r-value, and what does my book mean by stack, heap and unit of memory? (I want a simple, easy-to-understand answer.) I found many questions on this site dealing with stack and heap but could not understand the answers, as they were very technical and I do not have that much technical knowledge. I would also like to know where I can learn more about this.
Here are the pages from my textbook:

When the terms l-value and r-value were first coined, l and r
indeed meant left and right. That is, l-value originally meant left
hand side of the assignment and r-value meant right hand side of the
assignment. However, later on, they were revised to indicate
'locator' and 'read' respectively as your book suggests. The reason
was that programming languages like C have many operators (e.g.
address of operator &) where the operand that appears to the right
hand side of the operator is still an l-value.
Stack and heap are areas of memory. The stack is used to store
local variables and function calls. The heap is used to store objects.
The heap is shared by all threads of your application, while each
thread is assigned its own stack.
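To make the copying of values concrete, here is a minimal sketch in Java. The class and method names (PassByValueDemo, Box, changePrimitive, changeObject, reassign) are made up for this illustration. It assumes nothing beyond standard Java: the called method always receives a copy, either of the primitive value or of the reference.

```java
public class PassByValueDemo {
    // Hypothetical holder class, used only for this illustration.
    static class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Receives a copy of the caller's int; changing it has no effect outside.
    static void changePrimitive(int n) {
        n = 99;
    }

    // Receives a copy of the reference; both copies refer to the same heap object,
    // so a change made through the copy is visible to the caller.
    static void changeObject(Box b) {
        b.value = 99;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    static void reassign(Box b) {
        b = new Box(-1);
    }

    public static void main(String[] args) {
        int x = 5;
        Box box = new Box(5);

        changePrimitive(x);
        changeObject(box);
        reassign(box);

        System.out.println(x);         // prints 5
        System.out.println(box.value); // prints 99
    }
}
```

The reassign method is the interesting case: if Java passed the variable itself rather than a copy of the reference, the caller's box would end up holding -1.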

Stack and heap:
Stack
The stack is simply a specific range of memory which each program starts with. Every program has a stack, and while that program is running, the CPU stores a pointer to where the 'top' of the stack is.
When a function is called, code (produced by a compiler) writes values (copies of parameters, and also the return address of the code that called the function) to the point in memory which is referred to by this stack pointer (SP). It then modifies the SP to point a bit further along, to the point after the parameters.
When your function returns, it writes the return value to the point in memory pointed to by the SP, and then jumps the code execution back to the code that called the function. That code then copies the return value from the SP location, and decrements the SP.
This area is called a stack because the program copies ('pushes') values onto it when either (1) you declare local variables or (2) you call functions with parameters.
It then 'pops' the parameters and local variables off when returning from a function.
(This is how it works in theory. In practice, the compiler will write instructions that keep the values, and also the return value, in CPU registers where it can.)
Heap
The heap simply refers to all other memory which is allocated by the program, usually by a system call (for example brk on Linux, which malloc in C may call). A program can have many chunks of memory which it has asked the operating system to allocate to it. These chunks of memory (as a whole) are called the heap.
In Java:
When you use the 'new' keyword, it gives you back a pointer (reference) to some memory which was obtained from the operating system.
When you declare a variable without using new, the compiled code simply uses the existing memory at the top of the stack memory area, and then changes the stack pointer.
When you declare a pointer (reference) variable and assign it an object you create with new ExampleObject(), you are actually doing both things. The reference variable is created at the stack pointer location, and the stack pointer is moved (advanced by the size of the pointer value, 8 bytes in this example). Then the new operator obtains memory for the object from the heap area, and the address of that memory is copied into the local reference variable.
In practice, when a language like Java executes a program, it starts with a certain size of stack and a certain size of memory (called heap) already allocated to it by the operating system, and only asks for more memory when it is running out of space.
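As a rough illustration of that pre-allocated heap, the standard java.lang.Runtime class reports how much heap the JVM currently holds and how far it may grow. The class name MemoryInfo and the helper snapshot are made up for this sketch. The actual numbers depend on the JVM and on startup options such as -Xms and -Xmx, so no particular output is shown.

```java
public class MemoryInfo {
    // Returns { total, max } in bytes, read from the running JVM.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long max = rt.maxMemory();     // upper limit the JVM will try to use
        return new long[] { total, max };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mib = 1024L * 1024L;
        System.out.println("heap currently reserved: " + s[0] / mib + " MiB");
        System.out.println("heap limit:              " + s[1] / mib + " MiB");
        System.out.println("free within reserved:    "
                + Runtime.getRuntime().freeMemory() / mib + " MiB");
    }
}
```

Running it twice, once with a small -Xmx value and once with a large one, should show the limit changing while the program itself stays the same.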
It would be worth your time to read about how CPUs work, in particular how they have registers which store values, one of which is the stack pointer, and how they perform additions and subtractions. This is important because, when adding, they do not (usually) add a number at one memory address directly to a number at another memory address. If you look at assembly instructions (which are similar to Java bytecode), what they do more often is something like the following.
For example, for a function int addnum(int a, int b) {return a+b;}
a. Load the number from where the SP is pointing into register R1
b. Load the number from just before where the SP is pointing (SP-1) into register R2
c. Call the Add CPU instruction, which stores the result in register R3
d. Copy the R3 value to the stack slot reserved for the return value
Which might look like this.
Calling code like this: (note, these are made up example CPU instructions - they are different for each CPU and Java has its own bytecode which is similar. I'm just using for example STORESP => write to stack, LOADSP => load from stack pointer)
int x;
x = addnum(9,6);
INCSP +1 #allocate x at location SP and increment SP by 1
# start function call
# make 3 spaces, for a, b, and the return value
INCSP +3 #add 3 to SP register
STORESP 9,0 # copy 9 value to SP-0
STORESP 6,-1 # copy 6 value to SP-1
JUMP addnum # jump to executing the function code
Then, the function itself
LOADSP,0,R1 #copy from SP-0 (a) into reg 1
LOADSP,-1,R2 #copy from SP-1(b) into reg 2
ADDREG,R1,R2,R3 # add reg1 reg2 and store in R3
STORESP,R3,-2 #save the result to SP-2
RETURN
Then, back in the calling code:
Store the result in x (copy SP-2 to SP-3)
LOADSP,-2,R1
STORESP,R1,-3
Now the function call is done, so throw away the space allocated on the stack for a, b and the return value
(by decrementing SP by 3)
INCSP -3 # subtract 3 from the SP register
And now the result, is in 'x'
Of course this is much simplified and not accurate; it is only an example to help understanding.
But if you look at how these low-level things work, even for something as basic as adding two numbers, it will help you understand "where" and "how" parameters are passed to functions, and just how important the stack concept is.
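The walk-through above can also be imitated in Java, using an int array as a stand-in for the stack memory and an int index as the stack pointer. This is only a loose sketch: the names StackSketch, stack, sp, addnum and callAddnum are made up, the slot layout follows the made-up instructions above, and a real JVM or CPU does not work on a Java array like this.

```java
public class StackSketch {
    static final int[] stack = new int[16]; // stand-in for the stack memory area
    static int sp = 0;                      // index of the current top slot

    // Mirrors the function listing: reads a and b relative to sp,
    // then writes the sum into the return-value slot at sp-2.
    static void addnum() {
        int r1 = stack[sp];     // LOADSP 0  -> a
        int r2 = stack[sp - 1]; // LOADSP -1 -> b
        int r3 = r1 + r2;       // ADDREG
        stack[sp - 2] = r3;     // STORESP -2
    }

    // Mirrors the calling code for: int x; x = addnum(a, b);
    static int callAddnum(int a, int b) {
        sp += 1;                       // reserve a slot for x
        sp += 3;                       // reserve slots for a, b and the return value
        stack[sp] = a;                 // a lives at sp-0
        stack[sp - 1] = b;             // b lives at sp-1
        addnum();                      // "jump" to the function
        stack[sp - 3] = stack[sp - 2]; // copy the return value into x
        sp -= 3;                       // discard a, b and the return value
        int x = stack[sp];             // x is now the top slot
        sp -= 1;                       // discard x as well, leaving the stack as it was
        return x;
    }

    public static void main(String[] args) {
        System.out.println(callAddnum(9, 6)); // prints 15
    }
}
```

Note that the stack pointer ends where it started, which is the point of the exercise: everything the call pushed has been popped again.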
Good luck

Related

Where does the reference to the primitive variable get stored in java? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 months ago.
I know that primitive values are stored on the stack and non-primitive values are stored on the heap. As far as I know, the stack also contains references to the non-primitive values. My question is: where do we store the reference to a primitive? For example,
int a=10;
Now the value 10 gets stored in stack as far as I know, but my question is where does the value a get stored?
Also please suggest some good resources to learn memory management in Java. I've read tons of stuff online, but none of them are clear to me.

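So it's not correct to say that primitives are stored (only) on the stack and references are stored (only) on the heap. First, some code:
public class Main {
    public static void main(String... args) {
        // w is a local reference variable; the Widget it refers to lives on the heap
        Widget w = new Widget();
    }
}
class Widget {
    private int i = 10;              // primitive field, stored inside the Widget instance
    private String name = "FooBar";  // reference field, also stored inside the instance
}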
So in main, the reference w is stored (canonically) on the stack. The instance the reference points to is stored on the heap.
Within that instance, there's a variable i that's stored on the heap, because it's part of the Widget instance. There's also a reference to another object, name, which points to a second object instance, "FooBar", that is also stored on the heap.
So here is an example of a reference w on the stack, and a reference name on the heap. They can be stored in either location.
Ditto with primitives. I think you understand how primitives get stored on the stack. So I had Widget store a primitive on the heap to show how that works.
(All that said, the "stack" can be kept on the heap, if the compiler or JVM/OS really want to, and small objects can be allocated on the stack for speed, so it's really a bit more complicated. But ignoring tricky stuff, normally we say that the stack and the heap work like above.)
"Also please suggest some good resources to learn memory management in Java."
Take a systems programming course or two at your university (typically one on compilers/linkers and one on "how to write an operating system"). If you're not at a university, go to university. Some stuff you can't learn by reading some random WordPress blog.
"Where does the value a get stored?"
"a" is just a label used by the source text; it gets effectively erased at runtime. The compiled code (bytecode or assembly language) just uses a number to refer to the variable, and that number is the offset (or position) on the stack. It's just like accessing an array of numbers. The stack is like an int array, stack = new int[100] or whatever, and each local variable is just a number, an offset, into that array. This includes references too, which are stored at an offset in the stack and themselves just "point" to heap memory by being the address (as a number) of the location in memory of the object. It gets more complicated because the stack can also hold byte, char (16 bits) and long values, but the CPU and assembly language can handle that fine.

What happens(with respect to memory) when two int variables are given same value in java?

I know that local variables are stored on the stack in Java. But what happens with respect to memory allocation on the stack when two int variables are given the same value (how are they related)? Is there any kind of copy-on-write semantic? How does it work?

The obvious answer is that int x=5; int y=5; does the same as int x=5; int y=6; with the only exception being that, in your case, the memory that's been associated with y will look the same as it does for x.
It's difficult to envisage a JVM where some kind of copy-on-write semantic is used, i.e. x and y are allocated the same piece of memory until one of them changes its value. As far as I know, that is theoretically permitted by the Java Language Specification, but it's unlikely to be used for anything so trivial as an int: the overhead of setting up the copy-on-write would far outweigh the cost of giving them different memory from the outset.
(Out of interest, it was possible to write copy-on-write semantics for the C++ std::string class, but since C++11 it's been disallowed).
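Whatever a JVM does internally, the observable behaviour is that the two variables are independent. A minimal sketch (the class name TwoInts and the helper demo are made up for this illustration):

```java
public class TwoInts {
    // Gives x and y the same value, then changes only y.
    static int[] demo() {
        int x = 5;
        int y = 5;
        y = 6; // changing y has no effect on x
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0]); // prints 5
        System.out.println(result[1]); // prints 6
    }
}
```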

It will definitely create two variables on the stack, each holding the value 5.
There's no other way.
Any variable, local or global, primitive or non-primitive, stores some value.
For a primitive, the value is whatever you assign, according to its type.
For a non-primitive, the value is the memory address of the object on the heap. In the end both simply hold a value, and neither checks whether some other variable holds the same one.

Where is "null" in memory

In java, you cannot state an array's size in its declaration
int[5] scores; //bad
I'm told this is because the JVM does not allocate space in memory until an object is initialized.
If you have an instance array variable (auto initialized with a default value of null), does that variable point to a place in the heap indicating null?

In typical JVM implementations, a null reference is literally a zero value. The operating system prevents any program from accessing the zero address in memory, so the JVM makes sure that a reference value isn't zero before allowing you to access it. This lets the JVM give you a nice NullPointerException rather than a "The program has performed an Illegal Operation" crash.
So you could say that the variable "points to" an invalid heap location. Or you could just say the variable doesn't "point to" anything. At that point it's just a question of semantics.
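From the Java side, all of this can be observed without knowing how null is represented. The following sketch uses a made-up class NullDemo with an instance array field: the field starts out as null, using it throws NullPointerException, and it only refers to something on the heap after an array is allocated.

```java
public class NullDemo {
    int[] scores; // instance field, automatically initialized to null

    // Returns true if reading scores.length throws NullPointerException.
    static boolean throwsNpe(NullDemo d) {
        try {
            int n = d.scores.length; // dereferences the field
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NullDemo d = new NullDemo();
        System.out.println(d.scores == null); // prints true
        System.out.println(throwsNpe(d));     // prints true

        d.scores = new int[5];                // now the field refers to a real array
        System.out.println(throwsNpe(d));     // prints false
        System.out.println(d.scores.length);  // prints 5
    }
}
```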

No, because in the JVM there's no need for that.
If you're in a native language (C and C++, for instance), NULL is a pointer with a zero value, and that points to the memory base address. Obviously that's not a valid address, but you "can" dereference it anyway - especially in a system without protected memory, like old MS-DOS or small ones for embedded processors. Not that it would be a valid address - usually that location contains interrupt vectors and you shouldn't touch them. And of course in any modern OS that will raise a protection fault.
But in the JVM a reference to an object is more like a handle (i.e. an index in a table) and null is an 'impossible' value (an index that is outside the domain of the table), so it can't be dereferenced and doesn't occupy space in such table.

I think this post will answer your question - Java - Does null variable require space in memory
In Java, null is just a value that a reference (which is basically a
restricted pointer) can have. It means that the reference refers to
nothing. In this case you still consume the space for the reference.
This is 4 bytes on 32-bit systems or 8 bytes on 64-bit systems.
However, you're not consuming any space for the class that the
reference points to until you actually allocate an instance of that
class to point the reference at.
Edit: As far as the String, a String in Java takes 16 bits (2 bytes)
for each character, plus a small amount of book-keeping overhead,
which is probably undocumented and implementation specific.
(remember to upvote the answer in the link if it helps you out)

Nope...you'll have a null (0x00) reference in the object's variable.

I would argue that "int[5] scores; //bad" is not due to memory allocation.
Notice that when you declare something you are really declaring
Type ReferenceName = new Type()
typically.
Observe the two examples
int[] scores = new int[5];
JLabel label = new JLabel();
The types (on the left hand side) are int[] and JLabel, which have nothing to do with memory allocation (except for a pointer), while the new instances (on the right side, requiring memory allocation) are int[5], requiring space for 5 ints, and JLabel(), requiring no arguments to call the constructor, but memory enough for a JLabel.
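One way to see that the length belongs to the allocated instance rather than to the type is to compare arrays of different lengths. In this sketch (the class name ArrayTypeDemo and the helper allocate are made up), both arrays have exactly the same runtime class, and the length is only chosen when new runs.

```java
public class ArrayTypeDemo {
    // The length is an ordinary runtime value, decided at allocation time.
    static int[] allocate(int n) {
        return new int[n];
    }

    public static void main(String[] args) {
        int[] scores;          // declares only a reference of type int[]
        scores = allocate(5);  // memory for 5 ints is allocated here
        int[] more = allocate(7);

        System.out.println(scores.getClass() == more.getClass()); // prints true
        System.out.println(scores.length + " " + more.length);    // prints 5 7
    }
}
```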

what is the size of empty class in C++,java?

What is the size of an empty class in C++ and Java?
Why is it not zero?
sizeof returns 1 in the case of C++.

Short Answer for C++:
The C++ standard explicitly says that a class cannot have zero size.
Long Answer for C++:
Because each object needs to have a unique address (also defined in the standard) you can't really have zero sized objects.
Imagine an array of zero sized objects. Because they have zero size they would all line up on the same address location. So it is easier to say that objects can not have zero size.
Note:
Even though an object has a non zero size, if it actually takes up zero room it does not need to increase the size of derived class:
Example:
#include <iostream>
class A {};
class B {};
class C: public A, B {};
int main()
{
std::cout << sizeof(A) << "\n";
std::cout << sizeof(B) << "\n";
std::cout << sizeof(C) << "\n"; // Result is not 3 as intuitively expected.
}
g++ ty.cpp
./a.out
1
1
1

In the Java case:
There is no simple way to find out how much memory an object occupies in Java; i.e. there is no sizeof operator.
There are a few ways (e.g. using Instrumentation or 3rd party libraries) that will give you a number, but the meaning is nuanced [1]; see In Java, what is the best way to determine the size of an object?
The size of an object (empty or non-empty) is platform specific.
The size of an instance of an "empty class" (i.e. java.lang.Object) is not zero because the instance has implicit state associated with it. For instance, state is needed:
so that the object can function as a primitive lock,
to represent its identity hashcode,
to indicate if the object has been finalized,
to refer to the object's runtime class,
to hold the object's GC mark bits,
and so on.
Current Hotspot JVMs use clever tricks to represent the state in an object header that occupies two 32 bit words. (This expands in some circumstances; e.g. when a primitive lock is actually used, or after identityHashCode() is called.)
[1] For example, does the size of the string object created by new String("hello") include the size of the backing array that holds the characters? From the JVM perspective, that array is a separate object!

Because every C++ object needs to have a separate address, it isn't possible to have a class with zero size (other than some special cases related to base classes). There is more information in C++: What is the size of an object of an empty class? .

Because an object has to have an address in memory, and to have an address in memory, it has to occupy "some" memory. So, it is usually, in C++, the smallest possible amount, i.e. 1 char (but that might depend on the compiler). In Java, I wouldn't be so sure.. it might have some default data (more than just a placeholder like in C++), but it would be surprising if it was much more than in C++.

C++ requires that a normal instantiation of it have a size of at least 1 (could be larger, though I don't know of a compiler that does that). It allows, however, an "empty base class optimization", so even though the class has a minimum size of 1, when it's used as a base class it does not have to add anything to the size of the derived class.
I'd guess Java probably does pretty much the same. The reason C++ requires a size of at least 1 is that it requires each object to be unique. Consider, for example, an array of objects with size zero. All the objects would be at the same address, so you'd really only have one object. Allowing it to be zero sounds like a recipe for problems...

It's defined by the C++ standard as "a nonzero value", because an allocated object must have a nonzero size in order to have a distinct address. A class that inherits from an empty class, however, is not required to increase in size, barring the usual increase of a vtable if there are virtual functions involved.

I don't know if there is a sizeof() operator in Java. What you can do is create an instance of the empty class (make it Serializable), send it through a PipedOutputStream and read it back as a byte array; byteArray.length gives you the size.
Alternatively, write the instance out to a file using an ObjectOutputStream, close the file, open it, and file.length() will give you the size of the object. Note that both approaches measure the serialized form of the object, not the memory it occupies in the JVM. Hope this helps, - M.S.
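A minimal sketch of the serialization idea follows. It swaps in a ByteArrayOutputStream for the piped stream or file, since that avoids threads and temporary files. The names SerializedSize, Empty and serializedLength are made up. The number it prints is the length of the serialized stream, which includes a stream header and a class descriptor, so it should not be read as the in-memory size of the object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSize {
    // Hypothetical empty class, serializable so that it can be written out.
    static class Empty implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Returns the number of bytes in the serialized form of obj.
    static int serializedLength(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray().length;
    }

    public static void main(String[] args) {
        System.out.println("serialized bytes: " + serializedLength(new Empty()));
    }
}
```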

As others have pointed out, C++ objects cannot have zero size. A class can take up zero space only when it acts as a base class of a different class. Take a look at @Martin York's answer for a description with examples, and also look at and vote for the other answers that are correct in this respect.
In Java, in the HotSpot VM, there is a memory overhead of 2 machine words per object (a word is usually 4 bytes on a 32-bit architecture) to hold bookkeeping information together with runtime type information. For arrays a third word is required to hold the size. Other implementations can take a different amount of memory (the classic Java VM, according to the same reference, took 3 words per object).

How is a reference to a Java object implemented?

Is a pointer simply used to implement a Java reference variable, or how is it really implemented?
Below are the lines from Java language specification
4.3.1 Objects An object is a class instance or an array. The reference
values (often just references) are
pointers to these objects, and a
special null reference, which refers
to no object.
Does that mean it is pointer all the time?

In modern JVMs, references are implemented as an address.
Going back to the first version of HotSpot (and a bit earlier for the "classic VM"), references were implemented as handles. That is a fixed pointer to a pointer. The first pointer never changes for any particular object, but as the object data itself is moved the second pointer is changed. Obviously this impacts performance in use, but is easier to write a GC for.
In the latest builds of JDK7 there is support for "compressed oops". I believe BEA JRockit has had this for some time. Moving to 64 bit systems requires twice as much memory and hence bandwidth for addresses. "Compressed oops" takes advantage of the least significant three or four bits of address always being zero. 32 bits of data are shifted left three or four bits, allowing 32 or 64 GB of heap instead of 4 GB.
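The arithmetic behind those figures can be checked with a few lines of Java. This is only an illustration of the shifting idea, not HotSpot's actual code: the names CompressedOopsSketch, addressableBytes, encode and decode are made up, and any heap base address that a real JVM might add is left out.

```java
public class CompressedOopsSketch {
    // Number of bytes reachable by a 32-bit value shifted left by `shift` bits.
    static long addressableBytes(int shift) {
        return (1L << 32) << shift;
    }

    // Hypothetical encode step; only valid for addresses aligned to 2^shift bytes.
    static int encode(long address, int shift) {
        return (int) (address >>> shift);
    }

    // Hypothetical decode step: widen the 32 bits without sign extension, then shift.
    static long decode(int compressed, int shift) {
        return (compressed & 0xFFFFFFFFL) << shift;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        System.out.println(addressableBytes(3) / gib); // prints 32
        System.out.println(addressableBytes(4) / gib); // prints 64

        long address = 24L * gib; // 8-byte aligned and well above 4 GiB
        System.out.println(decode(encode(address, 3), 3) == address); // prints true
    }
}
```

The round trip only works because the low three bits of the address are zero; an unaligned address would lose those bits in encode.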

You can actually go and get the source code from here: http://download.java.net/jdk6/source/
The short answer to your question is: yes, there is a pointer to a memory location for your java variables (and a little extra). However this is a gigantic oversimplification. There are many many many C++ objects involved in moving java variables around in the VM. If you want to get dirty take a look at the hotspot\src\share\vm\oops package.
In practice none of this matters to developing java though, as you have no direct way of working with it (and secondly you wouldn't want to, the JVM is optimized for various processor architectures).

The answer is going to depend on the JVM implementation, but the best way to think of it is as a handle. It is a value that the JVM can look up, in a table or some similar structure, to find the memory location of the object. That way the JVM can move objects around in memory during garbage collection without changing the memory pointers everywhere.
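The handle idea can be sketched in plain Java with an array standing in for memory and a list standing in for the handle table. All names here (HandleTableSketch, memory, handles, allocate, dereference, move) are made up, and this is a toy model of the concept rather than how any particular JVM is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class HandleTableSketch {
    // memory[i] stands in for the object stored at "address" i.
    static final String[] memory = new String[8];
    // handles.get(h) is the current address of the object with handle h.
    static final List<Integer> handles = new ArrayList<>();

    // Stores the payload at the given address and returns a new handle for it.
    static int allocate(String payload, int address) {
        memory[address] = payload;
        handles.add(address);
        return handles.size() - 1;
    }

    // Follows the handle to the current address, then reads the object.
    static String dereference(int handle) {
        return memory[handles.get(handle)];
    }

    // Moves the object, as a compacting collector might, updating only the table entry.
    static void move(int handle, int newAddress) {
        int oldAddress = handles.get(handle);
        memory[newAddress] = memory[oldAddress];
        memory[oldAddress] = null;
        handles.set(handle, newAddress);
    }

    public static void main(String[] args) {
        int h = allocate("widget", 5);
        move(h, 0);                         // the object changes address
        System.out.println(dereference(h)); // prints widget
    }
}
```

Code that holds the handle h never needs to be updated when the object moves, which is the benefit described above. The cost is the extra lookup on every access.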

A primitive type is always passed by value,
whereas a class variable is actually a reference variable for the object.
Consider a primitive type:
int i=0;
Suppose the value of this primitive is stored at the memory location with address 2068.
Every time you use this primitive as a parameter, a new copy is created, because it is passed by value, not by reference.
Now consider a class variable:
MyClass C1 = new MyClass();
This creates an object of type MyClass and a variable named C1.
The class variable C1 contains the address of the memory location of the object linked to the variable C1. So basically the class variable C1 points to the object's location (new MyClass()).
As a rule of thumb, local primitives are stored on the stack and objects on the heap.

Does that mean it is pointer all the time?
Yes, but it can't be manipulated as you normally would in C.
Bear in mind that Java is a different programming language that relies on its VM, so this concept (pointer) should be used only as an analogy to better understand the behavior of such artifacts.
