Java class construction under the hood


Consider we have classes like this:
class A {
public B b;
public void someFunc() { // called sometime
b = new B();
}
}
Class B's constructor assigns some inner variables.
Field b is not thread-safe, in the sense that another thread can see b as non-null during the execution of someFunc even though B's constructor hasn't finished yet.
My question is the following: how can it be (from the logic perspective) that the constructor hasn't finished yet?
For me reordering of such kind is magic.

In the context of thread safety, this usually happens because of just-in-time (JIT) compilers. A JIT compiler takes Java bytecode and translates it into machine code to make it run faster. During translation, it's able to make a lot of optimizations, such as inlining various methods and constructors.
Supposing B had a constructor like this:
class B {
int x;
B(int x) { this.x = x; }
}
When a constructor is inlined, it takes Java code that's something like this:
b = new B(1);
And translates it to machine code that takes steps similar to the following:
Allocate space for a B object somehow.
Store the pointer to that memory into b.
Store 1 in b.x.
In other words, code which is analogous to this (in terms of ordering):
b = new B();
b.x = 1;
But we don't actually call a constructor at all. We'd just allocate a B, however the JVM does it internally, and assign b.x directly. Calling the constructor would involve jump instructions, so it's a bit faster to inline it.
There's an example like that in the famous "Double-Checked Locking is Broken" Declaration.
A regular Java compiler would be allowed to inline constructors too, but regular Java compilers don't typically perform many optimizations.
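If the reordering described above is a problem, one common remedy is to declare the field volatile. The following is only a minimal sketch under that assumption; the class name SafePublicationSketch and the method readWhenPublished are made up for illustration and are not from the question. Under the Java Memory Model, the volatile write to b happens-before any read that observes the new reference, so the reader cannot see x as 0.

```java
class SafePublicationSketch {
    static final class B {
        int x;                          // deliberately not final
        B(int x) { this.x = x; }
    }

    static final class A {
        // volatile: the write in someFunc() happens-before any read
        // that sees the newly assigned reference
        volatile B b;

        void someFunc() { b = new B(1); }
    }

    // Starts a writer thread, waits until b is visible, and returns the x it sees.
    static int readWhenPublished() {
        A a = new A();
        Thread writer = new Thread(a::someFunc);
        writer.start();

        B seen;
        while ((seen = a.b) == null) {  // volatile read on every iteration
            Thread.yield();
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.x;
    }

    public static void main(String[] args) {
        System.out.println(readWhenPublished());
    }
}
```

Making x final in B would be another option, since final fields get their own visibility guarantee once the constructor completes without leaking this.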

An object instance can also "escape" from its constructor, like this:
public class EscapeDemo {
static void escape(B b) {
System.out.println(b.strA);
System.out.println(b.strB); // still null in this example, even if field is final and initialized to non-null value.
}
public static void main(String[] args) {
System.out.println(new B());
}
}
class B {
final String strA;
final String strB;
B() {
strA = "some operations";
EscapeDemo.escape(this);
strB = "here";
}
}
prints:
some operations
null
B@<hash code in hex>
In a similar way, the reference could escape to code that uses it from another thread.
As Andy Guibert added in a comment, writing code like this is bad practice: it can be the source of many weird errors and hard-to-trace bugs. Here, for example, something that should never be null is null.
If you want to do something with an object instance on creation, it is a much better idea to write a static factory method that creates the instance, does something with it (such as adding it to a collection or registry), and then returns it.
Also, if you allow for weird code and hacks: in bytecode, object creation is separate from the constructor call, so at the bytecode level it is possible to create an object, pass it somewhere, and call the constructor in some other place.
Otherwise, a field is assigned only after the right-hand side expression has been evaluated, so for the code
b = new B();
the field b can only be null or a B instance whose constructor has already been called, unless you set that field from inside B's constructor as in my escape example.
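The static factory idea mentioned above could look roughly like this. It is a sketch only: the class name RegisteredThing, the REGISTRY list and the method names are hypothetical, and the registry is not made thread-safe here.

```java
import java.util.ArrayList;
import java.util.List;

final class RegisteredThing {
    // Hypothetical registry; a plain ArrayList, so not safe for concurrent use.
    private static final List<RegisteredThing> REGISTRY = new ArrayList<>();

    final String strA;
    final String strB;

    // Private constructor: 'this' is never handed out while it runs.
    private RegisteredThing() {
        strA = "some operations";
        strB = "here";
    }

    // The instance is fully constructed before anyone else can see it.
    static RegisteredThing create() {
        RegisteredThing thing = new RegisteredThing();
        REGISTRY.add(thing);
        return thing;
    }

    static boolean isRegistered(RegisteredThing thing) {
        return REGISTRY.contains(thing);
    }
}
```

Because registration happens after the constructor returns, code that reads the registry never observes strB as null.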

If I understand your question correctly, you are asking about the case where you create an instance of A, which has a field b of type B, and the field b is not initialized when A is created but only when some other object calls someFunc(). What happens when another thread tries to access this field b?
If so: when you create a new object of type B, the JVM allocates some memory for this object and then returns a reference, which is stored in the field b. If another thread tries to access the field b before it has received the reference to the new object, it gets null; otherwise it gets the reference to the newly created object.

Related

Subclass object with superclass functionality

I'm confused why this code works.
The object "person" is declared as an instance of B with only the functionality of A, yet it somehow prints out hello twice. If the object only gets access to the methods in A, how can it end up accessing the print out statement?
abstract class A {
abstract void move();
void go() {
this.move();
}
}
class B extends A {
void move() {
System.out.println("hello");
}
}
public class Main {
public static void main(String[] args){
A person = new B();
person.move();
person.go();
}
}

No, you have a reference of type A, but the functionality depends on the object (in your code, new B()), so your A reference points to a B object with B functionality.

The object "person" is declared as an instance of B with only the functionality of A
Not exactly. Java is a reference-based language; here you have made an instance (new B()), and have exactly one reference to it, called person.
You could have no references to it (leading to the created object being garbage collected eventually). You can have 2000 references to it. You could have B b = new B(); A a = b;, and now you have 2 references to it, one of type A and one of type B. But they are pointing to the exact same object.
Thus, person isn't an object, it's a reference. And that reference has type A, sure, but the object it is referencing is just a B, it isn't 'restricted' to have only A functionality.
You probably know all this, but terminology is important here, as it seems to have led to some confusion. Restating your question with less ambiguous terminology:
The reference "person" currently references an instance of B, but only exposes functionality that A has.
Yes. And A's functionality is described solely by the signatures present there, not by the code. A's functionality involves having a go() method, which takes no arguments and returns nothing, as well as a move() method with the same rules. And that is all - 'invoking go will actually run the move method', or 'move has no implementation' is not part of that.
Thus, both move() and go() are part of the functionality exposed by A.
The implementation of the functionality has nothing whatsoever to do with the reference type (A person) and everything to do with what the reference is actually pointing at (new B()). B's implementations of the move() and go() methods specified by A are such that move() prints "hello", and go() calls move() (and thus also prints "hello"). The go() implementation is inherited from A; B was free to change that, but decided not to.
Said more technically:
Java uses something called dynamic dispatch. What that means is that at compile time (i.e. write time), Java figures out whether the method you're calling even exists and which variant you mean (if you have move(String a) and move(int a), those are two methods with the same name, and at write time Java decides which one you are attempting to invoke). It uses the type of the expression in front of the dot to figure that out. But then at runtime, the actual type of the object that 'the expression in front of the dot' is pointing at is used to figure out which actual code to invoke. This always happens and you can't opt out: you can't choose to run A's implementation when you invoke move(). Only B's implementation can choose not to override A's implementation, or to explicitly invoke its supertype's implementation. A user of B's code cannot do that.
Note that static stuff doesn't 'do' inheritance at all, and therefore, dynamic dispatch doesn't apply there.
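To make the two phases concrete, here is a small sketch (the names DispatchSketch, Animal, Dog and describe are invented for illustration). The overload is chosen at compile time from the declared type of the argument, while the implementation that runs is chosen at runtime from the actual object.

```java
class DispatchSketch {
    static class Animal {
        String describe(Object o) { return "Animal/Object"; }
        String describe(String s) { return "Animal/String"; }
    }

    static class Dog extends Animal {
        @Override String describe(Object o) { return "Dog/Object"; }
        @Override String describe(String s) { return "Dog/String"; }
    }

    static String run() {
        Animal a = new Dog();   // reference type Animal, object type Dog
        Object arg = "text";    // declared type Object, actual type String

        // Compile time picks describe(Object) because arg is declared as Object;
        // runtime then runs Dog's override of that method.
        String first = a.describe(arg);

        // Compile time picks describe(String); runtime again runs Dog's override.
        String second = a.describe("text");

        return first + ", " + second;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "Dog/Object, Dog/String"
    }
}
```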

Using a constructor within another in Java

Consider:
int a = 0;
int b = 3;
//Constructor 1
public ClassName (int a) {
this(a, b); //Error
//new ClassName(a, b) //No error
}
//Constructor 2
public ClassName (int a, int b) {
this.a = a;
this.b = b;
}
First question:
I get an error saying "b should be static". Why can't I use the default value (3) for b in this way?
Second question:
In the first constructor, if I use the commented-out part, I do not get an error. Is it an acceptable usage?

The use of instance variables in an explicit constructor invocation is prohibited by the JLS, Section 8.8.7.1.
An explicit constructor invocation statement in a constructor body may not refer to any instance variables or instance methods or inner classes declared in this class or any superclass, or use this or super in any expression; otherwise, a compile-time error occurs.
This prohibition on using the current instance explains why an explicit constructor invocation statement is deemed to occur in a static context (§8.1.3).
You referenced the instance variable b. The compiler didn't raise this error on a because it's a local variable, which shadows the instance variable a.
The "static context" may be why your IDE is suggesting to make b static, so it can be referenced. It makes sense; the ClassName part of the object isn't constructed yet. Replace that usage of b with something else, such as a static constant, or a literal int value.
To answer your other question, typing new ClassName(a, b) isn't an error because at this point, this instance has been constructed, and you're creating a separate, unrelated ClassName object.

First question: I get an error saying "b should be static". Why can't I use the default value (3) for b in this way?
The correct way to supply a default value for b to the other constructor is this(a, 3);. You cannot refer to instance variables until after this(...). That's just one of the rules of the language.
Second question: In the first constructor, if I use the commented-out part, I do not get an error. Is it an acceptable usage?
new ClassName(a, b); does something different. It creates a separate instance of the class. In my humble opinion it is probably not the best thing to do. People expect new to create one instance of a class, whereas using new ClassName in the constructor creates two.
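Putting both answers together, a version that compiles might look like the sketch below. The class name ChainedDefaults and the constant DEFAULT_B are assumptions made for the example; the point is only that a static constant (or a literal) may appear in this(...), while an instance field may not.

```java
class ChainedDefaults {
    // Static, so it may be referenced inside this(...)
    private static final int DEFAULT_B = 3;

    private final int a;
    private final int b;

    // Constructor 1: delegates to constructor 2 with the default for b
    ChainedDefaults(int a) {
        this(a, DEFAULT_B);
    }

    // Constructor 2: the only place where fields are assigned
    ChainedDefaults(int a, int b) {
        this.a = a;
        this.b = b;
    }

    int getA() { return a; }
    int getB() { return b; }
}
```

With this shape, new ChainedDefaults(5) creates exactly one object, with a equal to 5 and b equal to 3.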

When using variables in classes, it is important to note where each variable's scope is valid. The constructor parameters a and b are new variables that shadow the fields of the same name. Don't trick yourself into believing they are the same variables; they occupy separate storage. If you want to use your class's fields directly, remove the parameters from the functions. Then the names refer to the fields of the class you're in, rather than to arguments a and b that are local to the function.

If any, what class specific actions happen when declaring an empty object field of a given class in java?

Imagine a hierarchical set of classes, where Class A has a field of Class B, and Class B has a field of Class C. The fields are set in the constructor of each class.
Now if I create an object instance of a Class X, with a field "a" of Class A, where "a" is never set, and remains null:
If any, what class-specific "actions" happen from the object "a"? Will it call anything at all from its own fields? Does Class B or C react at all? I imagine that there might be memory allocation or similar, but I am not sure at all. The reason why I am asking is to get a better understanding of the data flow and sequence of actions in applications.
I have tried to find an answer to this question for a while, but I can't seem to find the right way to ask, as the question is a little too close to basic questions about how to define objects in Java.

So let's assume the following class definitions:
public class A {
private B b;
public A() {
b = new B();
}
}
public class B {
private C c;
public B() {
c = new C();
}
}
public class C {
public C() {
}
}
public class X {
private A a;
public X() {
}
}
Now let's assume that the following main is being executed:
public static final void main (String[] argv) {
X x = new X();
}
Here an instance of X is created on the heap, and a reference to this object is stored in the x variable.
Since no value is assigned to the a variable during construction, no instance of A is created. The a variable still takes up space in memory as part of the instance of X that was created (in other words, it still needs enough space to store a reference), but in this case a holds the null value (from Java Language Specification §4.12.5):
For all reference types (§4.3), the default value is null.
Now let's modify the X class as follows:
public class X {
private A a;
public X() {
a = new A();
}
}
If we were to execute the main() method again with this modified version of X, then as part of the construction process, X would cause an instance of A to be created which would cause an instance of B to be created which in turn would cause an instance of class C to be created. All these instances would take up space in the memory heap and the reference to these objects would be stored in their respective variables.
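The difference between the two versions of X can be checked with a small counter. This is a sketch with invented names (FieldDefaultsSketch, LazyX, EagerX, constructed); it mirrors the classes above but counts how many constructors of A, B and C actually run.

```java
class FieldDefaultsSketch {
    // Counts constructor calls of A, B and C (for illustration only)
    static int constructed = 0;

    static class C {
        C() { constructed++; }
    }

    static class B {
        final C c;
        B() { constructed++; c = new C(); }
    }

    static class A {
        final B b;
        A() { constructed++; b = new B(); }
    }

    // Field 'a' is declared but never assigned: it keeps the default null
    static class LazyX {
        A a;
    }

    // Field 'a' is assigned in the constructor: A, B and C are all created
    static class EagerX {
        final A a;
        EagerX() { a = new A(); }
    }
}
```

Creating a LazyX leaves the counter untouched and the field null, while creating an EagerX raises the counter by three, one for each of A, B and C.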

Rather than thinking of reference-type variables as holding "pointers", I think it more helpful to think of them as holding object identifiers. If X is a variable of class type Thing, then X holds either information sufficient to identify an instance of Thing or a class derived therefrom, or else information sufficient to say that it does not. Although reference-type variables in many Java implementations hold pointers of some sort, there's no requirement that they do so. A Java implementation which wanted to access more than four gigs of memory without having to use 64-bit object references could round all object sizes up to the next multiple of 16 bytes and then have each non-zero object reference store a scaled offset into the heap (so an object which is located 32016 bytes above the start of the heap would store the number "2001" [decimal] as a reference). Although Java doesn't say what the bit pattern associated with a reference means, the one thing it does specify is that the bit pattern with which array slots and object fields are initialized will never identify any object.

Is there a name for "this" in Java?

Eclipse will give an error, "The left-hand side of an assignment must be a variable", when I try something like:
public class Thing{
String a1;
int a2;
public void meth(){
Thing A = new Thing();
this = A;
}
}
I had to assign each variable (this.a1 = A.a1; this.a2 = A.a2;) as a work around.
Are there other ways to do this without going through each variable field?
And if this is not a variable what is it called?

this is a pseudo-variable that refers to the current instance of the object; it cannot be reassigned. It's also considered a keyword in the language, according to section §3.9 of the Java Language Specification.

No, there is no easy shortcut.
And if "this" is not a variable what is it called?
this is not a variable, it's a keyword.
Even though this is special, in many respects it acts like a reference. Therefore, for consistency, this = A would have to be a reference assignment, which doesn't quite make sense.
You seem to be expecting this = A to perform a field-by-field copy from A to this, and indeed Java's designers could have chosen to do that in this case. However, this would be inconsistent with other reference assignments, and the overall benefits of having this as an exception are not at all clear.

this refers to this instance of the class.
You cannot assign to this

this is a Java reserved keyword that refers to the current object. It is not a variable.
So this = A; is invalid. Using the this keyword, we can refer to any instance variable or method of the current object. You have to refer to the instance variables like this:
this.a1 = A.a1;
From Doc:
The most common reason for using the this keyword is because a field
is shadowed by a method or constructor parameter.

You can't assign to this in Java. It's not a variable; it's a keyword.
One thing you might consider, if you don't need a particular instance, is just returning your new instance.
public class Thing{
String a1;
int a2;
public Thing meth(){
Thing A = new Thing();
return A;
}
}
and you'd use it like
whatever = whatever.meth();

According to java lang spec §15.8.3 this is a keyword that is either an expression or statement
When used as a primary expression this denotes a value that is a reference to the object for which the instance method was invoked.
Expression: Something which evaluates to a value. Example: x++
The keyword this is also used in a special explicit constructor invocation statement
Statement: Syntactic elements that control the execution of a program, which are executed for their effect and do not have values Example: if (true)
In either case it is not a variable
Variable: A storage location with an associated type
In your case this is an expression and not a variable. But for all intents and purposes, just call it a keyword.
Keyword: A character sequence, formed from ASCII letters, are reserved for use ... that cannot be used as a variable name

this refers to the owner of the method.
In this case, the owner is the object itself.
Sometimes, this may not refer to the class in which you are writing the code, for example inside an anonymous class. A common example is the anonymous listener.
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
Object listener = this; // refers to the ActionListener
}
}
);
In addition, you can return this to allow method chaining. Suppose you have a class called Homework and it has a method addTask.
public Homework addTask(String task){
return this;
}
you can call the addTask method like
homework.addTask("a").addTask("b").addTask("c");
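A fuller version of that chaining idea, as a sketch: the tasks list and the getTasks() accessor are additions for illustration and are not part of the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

class Homework {
    // Hypothetical storage so the chained calls have a visible effect
    private final List<String> tasks = new ArrayList<>();

    // Returning 'this' lets callers chain further calls on the same object
    Homework addTask(String task) {
        tasks.add(task);
        return this;
    }

    List<String> getTasks() {
        return tasks;
    }
}
```

Each call in homework.addTask("a").addTask("b").addTask("c") operates on the same Homework instance, so all three tasks end up in one list.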

I think the OP is asking for the ability to assign the contents of one object to another, rather than to assign a new value to the "this" pointer. C++ has this ability -- you can override the assignment operator -- but Java has no such ability.
It would be a nice feature to have in some occasional cases, but it's simply not currently possible, and it doesn't really fit the Java "mold" to provide the function in the future.
The capability would be more useful (and there would be more motivation to provide it) if Java allowed objects to be embedded in other objects (vs. simply embedding references), but that's not in the cards either.
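Since the language has no assignment operator to override, the usual workaround is to write the copy by hand, once, in the class itself. A minimal sketch, assuming a class shaped like the one in the question (CopyableThing and copyFrom are invented names):

```java
class CopyableThing {
    String a1;
    int a2;

    CopyableThing() {
    }

    // Copy constructor: builds a new object with the same field values
    CopyableThing(CopyableThing other) {
        copyFrom(other);
    }

    // Overwrites this object's fields with the values from 'other'.
    // The object identity stays the same; only the contents change.
    void copyFrom(CopyableThing other) {
        this.a1 = other.a1;
        this.a2 = other.a2;
    }
}
```

Inside meth() the caller would then write copyFrom(A) where the question tried this = A. The copy is shallow: reference-typed fields end up pointing at the same objects as in the source.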

There is no [1] way to copy the values of all fields from one instance onto another in the basic Java language. And you should typically not need it. You can most often just replace the reference with one to the new instance, or work directly on the target instance.
In your case, when you want to reset all fields of an object to their initial values (and there is seldom a need for it), you typically use a reset method which either works on its own instance or is a static one working on any given object.
So
class A {
String a1; int a2;
void reset() { a1 = ""; a2 = 0; }
}
would be used as
A a = new A();
// modify a
a.reset();
and
class A {
String a1; int a2;
static void reset(A anotherA) { anotherA.a1 = ""; anotherA.a2 = 0; }
}
and use it like:
A.reset(a);
In both cases it makes sense to use the reset method also for setting the initial values in the constructor: A() { A.reset(this); } or A() { this.reset(); }
[1] Actually, there are some libraries that do it, and you can code it with the help of reflection; the only reasons I see for using it are to implement a clone() method or for some kind of wrapping/stubbing.

It sounds to me like what you're trying to do is have a method that reinitializes your object, i.e., sets it back to its initial values. That's why you want to create a new object and assign it to the current object, right?
If that's the case, let's try a different way of doing it, since, as has been said, you can't reassign this.
What if, instead of doing that, you tried something like this:
public class Thing {
String a1;
int a2;
public Thing() {
this.meth();
}
public void meth() {
this.a1 = "a1";
this.a2 = 2;
}
}
This way, Thing.meth() actually initializes your object, and the constructor calls it when the object is created. Then you can call it again whenever you'd like.

==Disclaimer, I don't know java==
You would want to assign manually.
I'm not sure why you are trying to create a new instance of Thing inside Thing, but as you don't set the values of a1 and a2 you would need to assign them the way you did.
this is a reserved keyword pointing to the object whose method it is used in.
For example, if you wanted to have another function named fish() your code may look something like this.
public class Thing{
String a1;
int a2;
public Thing meth(){
Thing A = new Thing();
return A;
}
public Thing fish(){
this.a1 = "foo";
Thing A = this.meth();
return A;
}
}

When you do this = stuff; you are trying to replace the current object instance reference (in this case, the one that you are initializing in the constructor) with another thing, and (in the particular case of Java) that's illegal; the language forbids you from doing it.
Think about it: if you could replace the reference to your current instance just like that, you could run into some serious memory and security problems (the reference to the constructed object would be lost and overwritten by some unknown object).
What is totally valid is referencing members of your current object using the . operator, because they are owned by this, so no problems should arise (at least no evident ones).
The JVM has some internal security measures (e.g., method max stack size verification, class file format validation, etc.) that prevent easy binary manipulation and are enforced by the language syntax. This could be seen as one of those.

method overriding

class A
{
int i=10;
void show()
{
System.out.println("class A");
}
}
class B extends A
{
int i=5;
public void show()
{
System.out.println("class B");
}
}
class M
{
public static void main(String s[])
{
A a=new B();
a.show();
System.out.println(a.i);
}
}
OUTPUT= class B
10
If class A method is overridden by class B method then why not the variable 'i'?

Because variables are not virtual, only methods are.

It is not overwritten, but hidden. In your output you specifically requested the value of a.i, not ((B)a).i.

This is a "feature" of the implementation. In memory, this looks like so:
a:
pointer to class A
int i
b:
pointer to class B
int i (from A)
int i (from B)
When you access i in an instance of B, Java needs to know which variable you mean. It must allocate both since methods from class A will want to access their own field i while methods from B will want their own i (since you chose to create a new field i in B instead of making A.i visible in B). This means there are two i and the standard visibility rules apply: Whichever is closer will win.
Now you say A a=new B(); and that's a bit tricky because it tells Java "treat the result from the right hand side as if it were an instance of A".
When you call a method, Java follows the pointer to the class (the first thing in the object in memory). There, it finds a list of methods. Methods override each other, so when it looks for the method show(), it will find the one defined in B. This makes method access fast: you can simply merge all visible methods into the (internal) method list of class B, and each call will mean a single access to that list. You don't need to search all classes upwards for a match.
Field access is similar. Java doesn't like searching. So when you say B b = new B();, b.i is obviously from B. But you said A a = new B() telling Java that you prefer to treat the new instance as something of type A. Java, lazy as it is, looks into A, finds a field i, checks that you can see that field and doesn't even bother to look at the real type of a anymore (because that would a) be slow and b) would effectively prevent you from accessing both i fields by casting).
So in the end, this is because Java optimizes the field and method lookup.

Why no field overrides in Java though?
Well, because instance field lookups in Java happen at compile time: Java simply gives you the value of the field at a given offset in object's memory (based on the type information at hand during compilation: in this case a is declared to be of type A).
void foo() {
A a = new B();
int val = a.i; // compiler uses type A to compute the field offset
}
One may ask "Why didn't compiler use type B since it knows that a is in fact an instance of B? Isn't it obvious from the assignment just above?". Of course, in the case above, it's relatively obvious and compiler may try to be smarter and figure it out.
But that's a compiler design "rat hole". What if a "trickier" piece of code is encountered, like so:
void foo(A a) {
int val = a.i;
}
If compiler were "smarter", it would become its job to look at all invocations of foo() and see what real type was used, which is an impossible job since compiler can not predict what other crazy things may be passed to foo() by unknown or yet unwritten callers.

It's a design decision by the developers of Java, and is documented in the Java Language Specification.
A method with the same method signature as a method in its parent class overrides the method in its parent class.
A variable with the same name as a variable in its parent class hides the parent's variable.
The difference is that hidden values can be accessed by casting the variable to its parent type, while overridden methods will always execute the child class's method.
As others have noted, in C++ and C#, to get the same override behavior as Java, the methods need to be declared virtual.
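The hiding rule can be seen side by side with overriding in a short sketch. It reuses the A and B shapes from the question, nested inside an invented wrapper class HidingSketch, and returns strings instead of printing so the results are easy to compare.

```java
class HidingSketch {
    static class A {
        int i = 10;
        String show() { return "class A"; }
    }

    static class B extends A {
        int i = 5;                               // hides A.i, does not override it
        @Override String show() { return "class B"; }
    }

    static String run() {
        A a = new B();
        String method = a.show();     // resolved at runtime: B's show()
        int viaA = a.i;               // resolved from the declared type A: 10
        int viaB = ((B) a).i;         // cast changes the static type: 5
        return method + " " + viaA + " " + viaB;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "class B 10 5"
    }
}
```

If the value should follow the runtime type, exposing it through an overridable getter is the usual approach, because method calls are dispatched dynamically while field accesses are not.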

a is declared as a reference of type A. You call the constructor B(), but the declared type of a is still A.
That is why a.i equals 10.
The method override still takes effect.
Note that a class declaration does not start with
public class A()
but with
public class A { ... }

Tip: You can use setters and getters to make sure of what data-members you use.
Or: You simply can set the values at the constructor instead of the class declaration.

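Fields without an access modifier are package-private by default, not private, so B does inherit A's i here. Declaring a field with the same name in the subclass hides the inherited field rather than overriding it, whatever access modifier you use.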

