For some Java byte code parser project I read the JVM spec and figured out that the bit mask values of the Java virtual machine class file format access modifier fields are
ACC_PUBLIC = 0x0001
ACC_FINAL = 0x0010
ACC_SUPER = 0x0020 # old invokespecial instruction semantics (Java 1.0x?)
ACC_INTERFACE = 0x0200
ACC_ABSTRACT = 0x0400
ACC_SYNTHETIC = 0x1000
ACC_ANNOTATION = 0x2000
ACC_ENUM = 0x4000
Somehow I have no idea what 0x1000 is for. I saw it once in an inner class, but for all inner classes I checked since then, this flag was never set. Do you now what the meaning of this flag is and where/when it is set?
A synthetic element is any element that is present in a compiled class file but not in the source code it is compiled from. By checking an element for it being synthetic, you allow a distinction of such elements for tools that process code reflectively. This is of course first of all relevant to libraries that use reflection but it is also relevant for other tools like IDEs that do not allow you to call synthetic methods or to work with synthetic classes. Finally, it is also important for the Java compiler to verify code during its compilation to never directly use synthetic elements. Synthetic elements are only used to make the Java runtime happy which simply processes (and verifies) the delivered code where it treats synthetic elements identically to any other element.
You already mentioned inner classes as an example where synthetic elements are inserted by the Java compiler, so let us look at such a class:
class Foo {
private String foo;
class Bar {
private Bar() { }
String bar() {
return foo;
}
}
Bar bar() {
return new Bar();
}
}
This compiles perfectly fine but without synthetic elements, it would be refused by a JVM that does not know a thing about inner classes. The Java compiler desugares the above class to something like the following:
class Foo {
private String foo;
String access$100() { // synthetic method
return foo;
}
Foo$Bar bar() {
return new Foo$Bar(this, (Foo$1)null);
}
Foo() { } // NON-synthetic, but implicit!
}
class Foo$Bar {
private final Foo $this; // synthetic field
private Foo$Bar(Foo $this) { // synthetic parameter
this.$this = $this;
}
Foo$Bar(Foo $this, Foo$1 unused) { // synthetic constructor
this($this);
}
String bar() {
return $this.access$100();
}
}
class Foo$1 { /*empty, no constructor */ } // synthetic class
As said, the JVM does not know about inner classes but enforces private access of members, i.e. an inner class would not be able to access its enclosing classes' private properties. Thus, the Java compiler needs to add so-called accessors to an accessed class in order to expose its non-visible properties:
The foo field is private and can therefore only be accessed from within Foo. The access$100 method exposes this field to its package in which an inner class is always to be found. This method is synthetic as it is added by the compiler.
The Bar constructor is private and can therefore only be called from within its own class. In order to instantiate an instance of Bar, another (synthetic) constructor needs to expose the construction of an instance. However, constructors have a fixed name (internally, they are all called <init>), thus we cannot apply the technique for method accessors where we simply named them access$xxx. Instead, we make constructor accessors unique by creating a synthetic type Foo$1.
In order to access its outer instance, an inner class needs to store a reference to this instance which is stored in a synthetic field $this. This reference needs to be handed to the inner instance by a synthetic parameter in the constructor.
Other examples for synthetic elements are classes that represent lambda expressions, bridge methods when overriding methods with a type-divergent signatures, the creation of Proxy classes or classes that are created by other tools like Maven builds or runtime code generators such as Byte Buddy (shameless plug).
It's the "synthetic" flag, set when the field or method is generated by the compiler. AFAIK it's for inner classes, which meshes with your observation, and must be set when an artifact doesn't appear in the source code.
http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#88571
Related
Reading documentation:
https://docs.oracle.com/javase/9/docs/api/java/util/ServiceLoader.html
I was confused by this sentence:
The service provider class file has more than one public static no-args method named "provider".
Assuming JavaDoc is correct, and assuming that static members are not inherited, how this might be possible in Java? Or it's an error in JavaDoc?
how this might be possible in Java?
It is not, since method signatures must be unique, and the signature is the method name and parameter types.
Except, that is not how the JVM works, only how Java works.
The JVM includes the return type as part of the signature, so technically a class can have multiple methods with the same name and parameters, but different return types.
So it can definitely happen for classes written in other JVM languages, but can it happen for a Java class?
Yes it can, when you have covariant return types for overridden methods. See Can overridden methods differ in return type?
Given the example in that answer, what really happens is this fake code:
class ShapeBuilder {
...
public Shape build() {
....
}
class CircleBuilder extends ShapeBuilder{
...
#Override
public bridge Shape build() { // override matches full JVM signature
return <Circle>build(); // call other method with different return type
}
public Circle build() {
....
}
The "bridge" method is a hidden method generated by the compiler to make the difference between Java and the JVM work correctly.
FYI: In that aspect, "bridge" methods are similar to "synthetic" methods, which are generated by the compiler to allow outer classes access to private members of inner classes, or vice versa.
As succinctly described here, overriding private methods in Java is invalid because a parent class's private methods are "automatically final, and hidden from the derived class". My question is largely academic.
How is it not a violation of encapsulation to not allow a parent's private method to be "overridden" (ie, implemented independently, with the same signature, in a child class)? A parent's private method cannot be accessed or inherited by a child class, in line with principles of encapsulation. It is hidden.
So, why should the child class be restricted from implementing its own method with the same name/signature? Is there a good theoretical foundation for this, or is this just a pragmatic solution of some sort? Do other languages (C++ or C#) have different rules on this?
You can't override a private method, but you can introduce one in a derived class without a problem. This compiles fine:
class Base
{
private void foo()
{
}
}
class Child extends Base
{
private void foo()
{
}
}
Note that if you try to apply the #Override annotation to Child.foo() you'll get a compile-time error. So long as you have your compiler/IDE set to give you warnings or errors if you're missing an #Override annotation, all should be well. Admittedly I prefer the C# approach of override being a keyword, but it was obviously too late to do that in Java.
As for C#'s handling of "overriding" a private method - a private method can't be virtual in the first place, but you can certainly introduce a new private method with the same name as a private method in the base class.
Well, allowing private methods to be overwritten will either cause a leak of encapsulation or a security risk. If we assume that it were possible, then we’d get the following situation:
Let's say that there's a private method boolean hasCredentials() then an extended class could simply override it like this:
boolean hasCredentials() { return true; }
thus breaking the security check.
The only way for the original class to prevent this would be to declare its method final. But now, this is leaks implementation information through the encapsulation, because a derived class now cannot create a method hasCredentials any more – it would clash with the one defined in the base class.
That’s bad: lets say this method doesn’t exist at first in Base. Now, an implementor can legitimately derive a class Derived and give it a method hasCredentials which works as expected.
But now, a new version of the original Base class is released. Its public interface doesn’t change (and neither do its invariants) so we must expect that it doesn’t break existing code. Only it does, because now there’s a name clash with a method in a derived class.
I think the question stems from a misunderstanding:
How is it /not/ a violation of encapsulation to not allow a parent's private method to be "overridden" (ie, implemented independently, with the same signature, in a child class)
The text inside the parentheses is the opposite of the text before it. Java does allow you to “independently implement [a private method], with the same signature, in a child class”. Not allowing this would violate encapsulation, as I’ve explained above.
But “to not allow a parent's private method to be "overridden"” is something different, and necessary to ensure encapsulation.
"Do other languages (C++ or C#) have different rules on this?"
Well, C++ has different rules: the static or dynamic member function binding process and the access privileges enforcements are orthogonal.
Giving a member function the private access privilege modifier means that this function can only be called by its declaring class, not by others (not even the derived classes). When you declare a private member function as virtual, even pure virtual (virtual void foo() = 0;), you allow the base class to benefit from specialization while still enforcing the access privileges.
When it comes to virtual member functions, access privileges tells you what you are supposed to do:
private virtual means that you are allowed to specialize the behavior but the invocation of the member function is made by the base class, surely in a controlled fashion
protected virtual means that you should / must invoke the upper class version of the member function when overriding it
So, in C++, access privilege and virtualness are independent of each other. Determining whether the function is to be statically or dynamically bound is the last step in resolving a function call.
Finally, the Template Method design pattern should be preferred over public virtual member functions.
Reference: Conversations: Virtually Yours
The article gives a practical use of a private virtual member function.
ISO/IEC 14882-2003 §3.4.1
Name lookup may associate more than one declaration with a name if it finds the name to be a function name; the declarations are said to form a set of overloaded functions (13.1). Overload resolution (13.3) takes place after name lookup has succeeded. The access rules (clause 11) are considered only once name lookup and function overload resolution (if applicable) have succeeded. Only after name lookup, function overload resolution (if applicable) and access checking have succeeded are the attributes introduced by the name’s declaration used further in expression processing (clause 5).
ISO/IEC 14882-2003 §5.2.2
The function called in a member function call is normally selected according to the static type of the object expression (clause 10), but if that function isvirtualand is not specified using aqualified-idthen the function actually called will be the final overrider (10.3) of the selected function in the dynamic type of the object expression [Note: the dynamic type is the type of the object pointed or referred to by the current value of the object expression.
A parent's private method cannot be accessed or inherited by a child class, inline with principles of encapsulation. It is hidden.
So, why should the child class be
restricted from implementing its own
method with the same name/signature?
There is no such restriction. You can do that without any problems, it's just not called "overriding".
Overridden methods are subject to dynamic dispatch, i.e. the method that is actually called is selected at runtime depending on the actual type of the object it's called on. With private method, that does not happen (and should not, as per your first statement). And that's what is meant by the statement "private methods can't be overridden".
I think you're misinterpreting what that post says. It's not saying that the child class is "restricted from implementing its own method with the same name/signature."
Here's the code, slightly edited:
public class PrivateOverride {
private static Test monitor = new Test();
private void f() {
System.out.println("private f()");
}
public static void main(String[] args) {
PrivateOverride po = new Derived();
po.f();
});
}
}
class Derived extends PrivateOverride {
public void f() {
System.out.println("public f()");
}
}
And the quote:
You might reasonably expect the output to be “public f( )”,
The reason for that quote is that the variable po actually holds an instance of Derived. However, since the method is defined as private, the compiler actually looks at the type of the variable, rather than the type of the object. And it translates the method call into invokespecial (I think that's the right opcode, haven't checked JVM spec) rather than invokeinstance.
It seems to be a matter of choice and definition. The reason you can't do this in java is because the specification says so, but the question were more why the specification says so.
The fact that C++ allows this (even if we use virtual keyword to force dynamic dispatch) shows that there is no inherent reason why you couldn't allow this.
However it seem to be perfectly legal to replace the method:
class B {
private int foo()
{
return 42;
}
public int bar()
{
return foo();
}
}
class D extends B {
private int foo()
{
return 43;
}
public int frob()
{
return foo();
}
}
Seems to compile OK (on my compiler), but the D.foo is not related to B.foo (ie it doesn't override it) - bar() always return 42 (by calling B.foo) and frob() always returns 43 (by calling D.foo) no matter whether called on a B or D instance.
One reason that Java does not allow override the method would be that they didn't like to allow the method to be changed as in Konrad Rudolph's example. Note that C++ differs here as you need to use the "virtual" keyword in order to get dynamic dispatch - by default it hasn't so you can't modify code in base class that relies on the hasCredentials method. The above example also protects against this as the D.foo does not replace calls to foo from B.
When the method is private, it's not visible to its child. So there is no meaning of overriding it.
I apologize for using the term override incorrectly and inconsistent with my description. My description describes the scenario. The following code extends Jon Skeet's example to portray my scenario:
class Base {
public void callFoo() {
foo();
}
private void foo() {
}
}
class Child extends Base {
private void foo() {
}
}
Usage is like the following:
Child c = new Child();
c.callFoo();
The issue I experienced is that the parent foo() method was being called even though, as the code shows, I was calling callFoo() on the child instance variable. I thought I was defining a new private method foo() in Child() which the inherited callFoo() method would call, but I think some of what kdgregory has said may apply to my scenario - possibly due to the way the derived class constructor is calling super(), or perhaps not.
There was no compiler warning in Eclipse and the code did compile. The result was unexpected.
Beyond anything said before, there's a very semantic reason for not allowing private methods to be overridden...THEY'RE PRIVATE!!!
If I write a class, and I indicate that a method is 'private', it should be completely unseeable by the outside world. Nobody should be able access it, override it, or anything else. I simply ought to be able to know that it is MY method exclusively and that nobody else is going to muck with it or depend on it. It could not be considered private if someone could muck with it. I believe that it's that simple really.
A class is defined by what methods it makes available and how they behave. Not how those are implemented internally (e.g. via calls to private methods).
Because encapsulation has to do with behavior and not implementation details, private methods have nothing to do with the idea encapsulation. In a sense, your question makes no sense. It's like asking "How is putting cream in coffee not a violation of encapsulation?"
Presumably the private method is used by something that is public. You can override that. In doing so, you've changed behavior.
I have a class which contains:
a public, static inner enum
private constants referring to an int and to the enum
a private inner class
Referring to the enum constant from inside the inner class generates a synthetic accessor warning, but referring to the int constant does not.
Why is this? How can I prevent this warning without making the enum constant package scope or disabling the warning? (e.g. perhaps by making an explicit accessor?)
Code reproducing the problem:
public class Outer {
public static enum Bar {
A, B, C;
}
private static final int FOO = 0;
private static final Outer.Bar BAR = Outer.Bar.A;
private class Inner {
{
int foo = Outer.FOO; // just fine
Outer.Bar bar = Outer.BAR; // synthetic accessor method warning
}
}
}
The JLS requires that accesses to compile time constants be "inlined" into the consuming class, rather than leaving a symbolic reference to be resolved at link and run time. In particular, from JLS 13.1:
References to fields that are constant variables (§4.12.4) are
resolved at compile time to the constant value that is denoted. No
reference to such a field should be present in the code in a binary
file (except in the class or interface containing the field, which
will have code to initialize it).
In particular, this means that you class Inner doesn't have any trace of Outer.FOO in its compiled class file. Rather, the constant value 0 of FOO is directly compiled into Inner at compile time.
Synthetic access methods are (among other reasons) added to ease runtime access to private fields between nested classes - but in the case of FOO, the access is already resolved at compile time, so no method is needed.
Outer.BAR on the other hand, is not eligible for compile-time constant treatment - because objects other than certain String objects are not compile time constants per 15.28:
A compile-time constant expression is an expression denoting a value
of primitive type or a String that does not complete abruptly and is
composed using only the following...
An Enum is neither a primitive nor a String, so the special handling for compile-time constants doesn't apply, and a synthetic accessor method is generated.
As succinctly described here, overriding private methods in Java is invalid because a parent class's private methods are "automatically final, and hidden from the derived class". My question is largely academic.
How is it not a violation of encapsulation to not allow a parent's private method to be "overridden" (ie, implemented independently, with the same signature, in a child class)? A parent's private method cannot be accessed or inherited by a child class, in line with principles of encapsulation. It is hidden.
So, why should the child class be restricted from implementing its own method with the same name/signature? Is there a good theoretical foundation for this, or is this just a pragmatic solution of some sort? Do other languages (C++ or C#) have different rules on this?
You can't override a private method, but you can introduce one in a derived class without a problem. This compiles fine:
class Base
{
private void foo()
{
}
}
class Child extends Base
{
private void foo()
{
}
}
Note that if you try to apply the #Override annotation to Child.foo() you'll get a compile-time error. So long as you have your compiler/IDE set to give you warnings or errors if you're missing an #Override annotation, all should be well. Admittedly I prefer the C# approach of override being a keyword, but it was obviously too late to do that in Java.
As for C#'s handling of "overriding" a private method - a private method can't be virtual in the first place, but you can certainly introduce a new private method with the same name as a private method in the base class.
Well, allowing private methods to be overwritten will either cause a leak of encapsulation or a security risk. If we assume that it were possible, then we’d get the following situation:
Let's say that there's a private method boolean hasCredentials() then an extended class could simply override it like this:
boolean hasCredentials() { return true; }
thus breaking the security check.
The only way for the original class to prevent this would be to declare its method final. But now, this is leaks implementation information through the encapsulation, because a derived class now cannot create a method hasCredentials any more – it would clash with the one defined in the base class.
That’s bad: lets say this method doesn’t exist at first in Base. Now, an implementor can legitimately derive a class Derived and give it a method hasCredentials which works as expected.
But now, a new version of the original Base class is released. Its public interface doesn’t change (and neither do its invariants) so we must expect that it doesn’t break existing code. Only it does, because now there’s a name clash with a method in a derived class.
I think the question stems from a misunderstanding:
How is it /not/ a violation of encapsulation to not allow a parent's private method to be "overridden" (ie, implemented independently, with the same signature, in a child class)
The text inside the parentheses is the opposite of the text before it. Java does allow you to “independently implement [a private method], with the same signature, in a child class”. Not allowing this would violate encapsulation, as I’ve explained above.
But “to not allow a parent's private method to be "overridden"” is something different, and necessary to ensure encapsulation.
"Do other languages (C++ or C#) have different rules on this?"
Well, C++ has different rules: the static or dynamic member function binding process and the access privileges enforcements are orthogonal.
Giving a member function the private access privilege modifier means that this function can only be called by its declaring class, not by others (not even the derived classes). When you declare a private member function as virtual, even pure virtual (virtual void foo() = 0;), you allow the base class to benefit from specialization while still enforcing the access privileges.
When it comes to virtual member functions, access privileges tells you what you are supposed to do:
private virtual means that you are allowed to specialize the behavior but the invocation of the member function is made by the base class, surely in a controlled fashion
protected virtual means that you should / must invoke the upper class version of the member function when overriding it
So, in C++, access privilege and virtualness are independent of each other. Determining whether the function is to be statically or dynamically bound is the last step in resolving a function call.
Finally, the Template Method design pattern should be preferred over public virtual member functions.
Reference: Conversations: Virtually Yours
The article gives a practical use of a private virtual member function.
ISO/IEC 14882-2003 §3.4.1
Name lookup may associate more than one declaration with a name if it finds the name to be a function name; the declarations are said to form a set of overloaded functions (13.1). Overload resolution (13.3) takes place after name lookup has succeeded. The access rules (clause 11) are considered only once name lookup and function overload resolution (if applicable) have succeeded. Only after name lookup, function overload resolution (if applicable) and access checking have succeeded are the attributes introduced by the name’s declaration used further in expression processing (clause 5).
ISO/IEC 14882-2003 §5.2.2
The function called in a member function call is normally selected according to the static type of the object expression (clause 10), but if that function isvirtualand is not specified using aqualified-idthen the function actually called will be the final overrider (10.3) of the selected function in the dynamic type of the object expression [Note: the dynamic type is the type of the object pointed or referred to by the current value of the object expression.
A parent's private method cannot be accessed or inherited by a child class, inline with principles of encapsulation. It is hidden.
So, why should the child class be
restricted from implementing its own
method with the same name/signature?
There is no such restriction. You can do that without any problems, it's just not called "overriding".
Overridden methods are subject to dynamic dispatch, i.e. the method that is actually called is selected at runtime depending on the actual type of the object it's called on. With private method, that does not happen (and should not, as per your first statement). And that's what is meant by the statement "private methods can't be overridden".
I think you're misinterpreting what that post says. It's not saying that the child class is "restricted from implementing its own method with the same name/signature."
Here's the code, slightly edited:
public class PrivateOverride {
private static Test monitor = new Test();
private void f() {
System.out.println("private f()");
}
public static void main(String[] args) {
PrivateOverride po = new Derived();
po.f();
});
}
}
class Derived extends PrivateOverride {
public void f() {
System.out.println("public f()");
}
}
And the quote:
You might reasonably expect the output to be “public f( )”,
The reason for that quote is that the variable po actually holds an instance of Derived. However, since the method is defined as private, the compiler actually looks at the type of the variable, rather than the type of the object. And it translates the method call into invokespecial (I think that's the right opcode, haven't checked JVM spec) rather than invokeinstance.
It seems to be a matter of choice and definition. The reason you can't do this in java is because the specification says so, but the question were more why the specification says so.
The fact that C++ allows this (even if we use virtual keyword to force dynamic dispatch) shows that there is no inherent reason why you couldn't allow this.
However it seem to be perfectly legal to replace the method:
class B {
private int foo()
{
return 42;
}
public int bar()
{
return foo();
}
}
class D extends B {
private int foo()
{
return 43;
}
public int frob()
{
return foo();
}
}
Seems to compile OK (on my compiler), but the D.foo is not related to B.foo (ie it doesn't override it) - bar() always return 42 (by calling B.foo) and frob() always returns 43 (by calling D.foo) no matter whether called on a B or D instance.
One reason that Java does not allow override the method would be that they didn't like to allow the method to be changed as in Konrad Rudolph's example. Note that C++ differs here as you need to use the "virtual" keyword in order to get dynamic dispatch - by default it hasn't so you can't modify code in base class that relies on the hasCredentials method. The above example also protects against this as the D.foo does not replace calls to foo from B.
When the method is private, it's not visible to its child. So there is no meaning of overriding it.
I apologize for using the term override incorrectly and inconsistent with my description. My description describes the scenario. The following code extends Jon Skeet's example to portray my scenario:
class Base {
public void callFoo() {
foo();
}
private void foo() {
}
}
class Child extends Base {
private void foo() {
}
}
Usage is like the following:
Child c = new Child();
c.callFoo();
The issue I experienced is that the parent foo() method was being called even though, as the code shows, I was calling callFoo() on the child instance variable. I thought I was defining a new private method foo() in Child() which the inherited callFoo() method would call, but I think some of what kdgregory has said may apply to my scenario - possibly due to the way the derived class constructor is calling super(), or perhaps not.
There was no compiler warning in Eclipse and the code did compile. The result was unexpected.
Beyond anything said before, there's a very semantic reason for not allowing private methods to be overridden...THEY'RE PRIVATE!!!
If I write a class, and I indicate that a method is 'private', it should be completely unseeable by the outside world. Nobody should be able access it, override it, or anything else. I simply ought to be able to know that it is MY method exclusively and that nobody else is going to muck with it or depend on it. It could not be considered private if someone could muck with it. I believe that it's that simple really.
A class is defined by what methods it makes available and how they behave. Not how those are implemented internally (e.g. via calls to private methods).
Because encapsulation has to do with behavior and not implementation details, private methods have nothing to do with the idea encapsulation. In a sense, your question makes no sense. It's like asking "How is putting cream in coffee not a violation of encapsulation?"
Presumably the private method is used by something that is public. You can override that. In doing so, you've changed behavior.
When passing a final object (A String in the code below) it shows up as null when printed from the anonymous inner class. However, when a final value type or straight final String is passed in, its value is correctly displayed. What does final really mean in context of the anonymous inner class and why is the object passed null?
public class WeirdInners
{
public class InnerThing
{
public InnerThing()
{
print();
}
public void print(){
}
}
public WeirdInners()
{
final String aString = "argh!".toString();
final String bString = "argh!";
System.out.println(aString);
System.out.println(bString);
InnerThing inner =new InnerThing(){
public void print()
{
System.out.println("inner"+aString); // When called from constructor, this value is null.
System.out.println("inner"+bString); // This value is correctly printed.
}
};
inner.print();
}
public static void main(String[] args)
{
WeirdInners test1 = new WeirdInners();
}
}
This is very strange behavior to me because the expectation is that the String is an object, why does calling toString() change things?
Other info: this behavior is only observed using Java 1.4, not in Java 5. Any suggestions on a workaround? Not calling toString() on an existing String is fair enough, but since this is only an example, it has real world implications if I perform it on a non-String object.
If you check the section on compile-time constants in the JLS, you'll see that calling .toString() does make a difference. As does nonsense like prefixing with false?null+"":.
What is important here is the relative ordering of setting the fields that are closed over and the constructor. If you use -target 1.4 or later (which is not the default in 1.4!) then the fields will be copied before calling the super. With the spec before 1.3 this was illegal bytecode.
As is often the case in these cases, javap -c is useful to see what the javac compiler is doing. The spec is useful for understanding why (should you have sufficient patience).
My guess would be that you trigger undefined behaviour when the constructor of InnerThing() passes (implicitly) its this to the print method of an anonymous subclass of InnerThing while the object is not fully constructed. This this in turn relies on an implicit reference to the this of WierdInners.
Calling .toString() moves initialisation of aString from compile time to runtime. Why the undefined behaviour differs between Java 1.4 and 1.5 is a JVM implementation detail probably.
It is dangerous to invoke overridden methods from a superclass constructor, as they will be called before the subclass-part of the object is initialized.
Moreover, an inner class accessing final variables of an enclosing scope will actually access copies of those variables (that's why they need to be final), and these copied fields reside in the anonymous subclass.
I suspect the reason bString is treated differently is that its value is known at compile time, which allows the compiler to inline the field access in the subclass, making the time that field is initialized irrelevant.