When sifting through Java call graphs generated by libraries like DependencyFinder and java-callgraph, I found out that the Java compiler generates names for anonymous functions, inner classes, etc.
I've found out the meaning of a couple of them (please correct if I'm wrong):
org.example.Bar$Foo refers to Foo, which is an inner class of org.example.Bar.
org.example.Bar$1 refers to an anonymous class declared inside one of the methods of org.example.Bar.
org.example.Bar.lambda$spam$1() refers to a lambda declared inside of org.example.Bar.spam() method.
However, I also found:
org.example.Bar$$Lambda$2.args$1
org.example.Bar$$Lambda$2.call()
org.example.Bar$$Lambda$7.lambdaFactory$()
org.example.Bar$$Lambda$7.get$Lambda()
What do the four names above refer to? What does the double dollar ($$) mean?
The classes for lambda expressions are not javac generated, but created at runtime by the JRE. Their names are completely unspecified and you can’t rely on any naming scheme.
But obviously, Oracle’s current JRE has a recognizable pattern. It appends $$Lambda$n to the name of the defining class, where n is an increasing number that reflects the creation order at runtime rather than any property of the compiled code.
You can verify this with the following program:
public class Test {
    public static void main(String... args) {
        if(args.length==0) {
            final boolean meFirst = Math.random()<0.5;
            if(meFirst) {
                Runnable r=Test::main;
                System.out.println("first run:\t"+r.getClass());
            }
            main("second run");
            if(!meFirst) {
                Runnable r=Test::main;
                System.out.println("first run:\t"+r.getClass());
            }
        }
        else {
            Runnable r=Test::main;
            System.out.println(args[0]+":\t"+r.getClass());
            if(args[0].equals("second run")) main("last run");
        }
    }
}
Depending on the state of the random meFirst flag, it will print either
first run: class Test$$Lambda$1
second run: class Test$$Lambda$2
last run: class Test$$Lambda$2
or
second run: class Test$$Lambda$1
last run: class Test$$Lambda$1
first run: class Test$$Lambda$2
It shows that the first generated class always gets the number 1, regardless of whether it’s one of the first two method references instantiated in the first main invocation or the third method reference, instantiated in the first recursion. Further, the 3rd execution always encounters the same class as the 2nd, as it’s the same method reference expression (note: distinct expression, as the target of all expressions is the same) and the class is re-used.
Depending on the version, you may further see something like /number appended to the names, which hints that the names really don’t matter, as each of these classes has another unique identifier (they are so-called “anonymous classes” in the JVM sense, which you can’t locate via ClassLoader and name; this is unrelated to anonymous inner classes in the language).
Field names like args$n within these classes represent the n’th captured value. Since the LambdaMetafactory has no knowledge about the actual names of the captured variables, it has no other choice but to generate such names.
But as said, that’s an implementation artifact. It’s possible to maintain such a naming pattern, as long as a new class is generated for each creation site in each defining class. But since the specification allows arbitrary sharing/reusing of classes and instances representing equivalent lambda expressions (doing the same) and method references (targeting the same method), such a naming pattern is not possible with every implementation strategy.
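To make the difference between compile-time and runtime names concrete, here is a small sketch. The class NamingDemo and its members are made up for illustration. It prints the binary names javac assigns to a nested class and an anonymous class, looks up the synthetic method javac generates for a lambda body, and finally prints the name of the runtime-generated lambda class, which is implementation-specific and may look different on your JRE.

```java
import java.lang.reflect.Method;
import java.util.function.Supplier;

class NamingDemo {
    // Nested class: its binary name is the outer name + "$Foo".
    static class Foo {}

    // Anonymous class: its binary name is the enclosing name + "$" + a number.
    static Object anonymous() {
        return new Object() {};
    }

    // javac compiles the lambda body into a synthetic method named lambda$spam$N.
    static Supplier<String> spam() {
        return () -> "spam";
    }

    /** Returns the name of the first synthetic lambda body method found, or null. */
    static String lambdaBodyMethod() {
        for (Method m : NamingDemo.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                return m.getName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(Foo.class.getName());
        System.out.println(anonymous().getClass().getName());
        System.out.println(lambdaBodyMethod());
        // Generated at runtime by the JRE; the name is unspecified.
        System.out.println(spam().getClass().getName());
    }
}
```

Only the first three names come from the compiled class file; the last one is produced when the lambda is first evaluated.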
In Java, what is the actual reason behind the inability to write a method in a sub class which has the same name as a final method in the super class? (Please note that I am not trying to override the method, this is why I have put the keyword final.)
Please see the example below:
class A {
    public final void method() {
        System.out.println("in method A");
    }
}
class B extends A {
    public void method() {
        System.out.println("in method B");
    }
}
The problem is expressed as "'method()' cannot override 'method()' in 'A'; overridden method is final" in the IDE; however, I would like to understand what it is about this situation that leads the compiler to fail.
Because in java, overriding isn't optional.
Names of methods at the class level.
At the class level (as in, what is in a class file, and what a JVM executes), method names include their return type and their parameter types (and, of course, the name). At the JVM level, varargs doesn't exist (it's an array instead), generics do not exist (they are erased for the purposes of signature), and the throws clause isn't a part of the story. But other than that, this method:
public void foo(String foo, int bar, boolean[] baz, long... args) throws Exception {}
turns into this name at the class file level:
foo(Ljava/lang/String;I[Z[J)V
which seems like gobbledygook, but [ is 'array of', the primitives get one letter each (Z for boolean, J for long, I for int), V is for void, and L followed by a fully qualified name and a semicolon is an object type. Now it makes sense.
That really is the method name at the class level, effectively (we'll call this its signature here; the JVM specification calls the part describing parameter and return types the descriptor). ANY invocation of a method in java, at the class level, always uses the complete signature. This means javac simply cannot compile a method call unless it actually knows the exact method you're invoking, which is why javac doesn't work unless you have the full classpath of everything you're calling available as you compile.
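If you want to see such a class-level name without reaching for javap, something like the following sketch should work. DescriptorDemo and its helper methods are names made up for this example; MethodType.toMethodDescriptorString() is the standard JDK method that renders the descriptor.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class DescriptorDemo {
    // The method from the text; at the class-file level the varargs parameter is just long[].
    public void foo(String foo, int bar, boolean[] baz, long... args) throws Exception {}

    /** Builds the JVM-level descriptor of a reflected method from its erased types. */
    static String descriptorOf(Method m) {
        return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                .toMethodDescriptorString();
    }

    /** Returns the name plus descriptor of foo, as it appears at the class level. */
    static String fooDescriptor() {
        try {
            Method m = DescriptorDemo.class.getMethod(
                    "foo", String.class, int.class, boolean[].class, long[].class);
            return m.getName() + descriptorOf(m);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fooDescriptor()); // foo(Ljava/lang/String;I[Z[J)V
    }
}
```

Note that the throws clause and the varargs marker leave no trace in the printed name.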
Overriding isn't optional!
At the class level, if you define a method whose full signature matches, exactly, a signature in your parent class, then it is overriding that method. Period. You can't not. @Override as an annotation doesn't affect this in the slightest (that annotation merely causes the compiler to complain if you aren't overriding anything; it's compiler-checked documentation, that's all it is).
javac goes even further
As a language thing, javac will make bridges if you want to tighten the return type. Given:
class Parent {
    Object foo() { return null; }
}
class Child extends Parent {
    String foo() { return null; }
}
Then at the class level, the full signature of the one method in Parent is foo()Ljava/lang/Object; whereas the one in Child has foo()Ljava/lang/String; and thus these aren't the same method and Child's foo would appear not to be overriding Parent's foo.
But javac intervenes, and DOES make these override. It does this by actually making 2 methods in Child. You can see this in action! Write the above, compile it, and run javap -c -v on Child and you see these. javac makes 2 methods: Both foo()Ljava/lang/String; and foo()Ljava/lang/Object; (which does have the same signature and thus overrides, by definition, Parent's implementation). That second one is implemented as just calling the 'real' foo (the one returning string), and gets the synthetic flag.
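If you'd rather not read javap output, reflection shows the same thing. This is a minimal sketch; BridgeDemo and countFoo are made-up names, and Parent and Child are nested only to keep the example in one file.

```java
import java.lang.reflect.Method;

class BridgeDemo {
    static class Parent {
        Object foo() { return null; }
    }

    static class Child extends Parent {
        @Override
        String foo() { return null; }
    }

    /** Counts the methods named foo declared in Child that are (or are not) bridge methods. */
    static int countFoo(boolean bridge) {
        int n = 0;
        for (Method m : Child.class.getDeclaredMethods()) {
            if (m.getName().equals("foo") && m.isBridge() == bridge) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Lists every foo declared in Child together with its return type and flags.
        for (Method m : Child.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getName() + " " + m.getName()
                    + "() bridge=" + m.isBridge() + " synthetic=" + m.isSynthetic());
        }
    }
}
```

With javac, Child should report one ordinary foo returning String and one bridge foo returning Object.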
Final is what it is
Which finally gets to your problem: Given that final says: I cannot be overridden, then, that's it. You've made 2 mutually exclusive rules now:
Parent's foo cannot be overridden
Child's foo, by definition (because its signatures match), overrides Parent's foo
Javac will just end it there, toss an error in your face, and call it a day. If you imagined some hypothetical javac update where this combination of factors ought to result in javac making a separate method: But, how? At the class level, same signature == same method (it's an override), so what do you propose? That java add a 0 to the end of the name?
If that's the plan, how should javac deal with this:
Parent p = new Child();
p.foo();
Which foo is intended there? foo()Ljava/lang/Object; from Parent, or foo0()Ljava/lang/Object; from Child?
You can write a spec that gives an answer to this question (presumably, here it's obvious: Parent's foo; had you written Child c = new Child(); c.foo(); then foo0 was intended), but that makes the language quite complicated, and for what purpose?
The java language designers did not think this is a useful exercise and therefore didn't add this complication to the language. I'm pretty sure that was clearly the right call, but your opinion may of course be different.
Final means not just that you can’t override it, it means you can’t work around having that method get called.
When you subclass the object, if you could make a method that shadows that final method, then you could prevent the superclass method from functioning, or substitute some other functionality than what the user of the object would expect. This would allow introducing malicious code and would defeat the purpose of making methods final.
In your case it sounds like making the superclass method final may not have been the best choice.
There are so many associated names (Early and Late Binding, Static and Dynamic Dispatch, Runtime vs. Compile-time Polymorphism, etc.) that I don't understand the differences.
I found a clear explanation, but is it correct? I'll paraphrase JustinC:
Binding: is determining the type of a variable (object?). If it's done at compile time, it's early binding. If it's done at run time, it's late binding.
Dispatch: is determining which method matches the method call. Static Dispatch is computing methods at compile time, whereas dynamic dispatch is doing it at run time.
Is Binding matching up primitive and reference variables with primitive values and objects respectively?
Edit: Please give me some clear reference material so I can read more about this.
I believe the confusion typically comes from how overloaded these terms are.
We program our programs in a high level language, and either a compiler or an interpreter must transform that into something a machine actually understands.
In coarse terms, you can picture a compiler transforming our method code into some form of machine code. If the compiler knew at that point exactly where in memory that method would reside when we run our program later, then it could safely go and find every invocation of this compiled method and replace it with a jump to the address where the compiled code resides, right?
Well, materializing this relationship is what I understand as binding. This binding, though, could happen at different moments, for example at compile time, linking time, load time, or at run time depending on the design of the language.
The terms static and dynamic are generally used to refer to things bound before run time and at run time, respectively.
Later binding times are associated with greater flexibility, earlier binding times are associated with greater efficiency. Language designers have to balance these two aspects when they're creating a language.
Most object-oriented programming languages support subtype polymorphism. In these languages, virtual methods are bound at runtime depending on the dynamic type of the object at that point. In other words, virtual method invocations are dispatched to the appropriate implementation at runtime based on the dynamic type of the object implementation involved and not based solely on its static type reference.
So, in my opinion, you must first bind the method invocation to a specific implementation or execution address, and then you can dispatch an invocation to it.
I had answered a very similar question in the past in which I demonstrate with examples how this happens in Java.
I would also recommend reading the book Programming Language Pragmatics. It is a great reference to learn all this kind of stuff from a theoretical standpoint.
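As a rough illustration of the two moments described above, here is a sketch in Java. All names are invented for the example. Which overload of describe is called is fixed by the compiler from the static type of the argument, while which sound implementation runs is decided at run time from the dynamic type of the object.

```java
class DispatchDemo {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "Woof"; }
    }

    // Two overloads: the call site is bound to one of them at compile time.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d) { return "describe(Dog)"; }

    static String overloadChosen() {
        Animal a = new Dog();   // static type Animal, dynamic type Dog
        return describe(a);     // bound from the static type
    }

    static String overrideChosen() {
        Animal a = new Dog();
        return a.sound();       // dispatched from the dynamic type
    }

    public static void main(String[] args) {
        System.out.println(overloadChosen()); // describe(Animal)
        System.out.println(overrideChosen()); // Woof
    }
}
```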
When you're looking for "low level" definitions, probably the only legitimate source is our old friend - the JLS. Though it does not give a clear definition in this case, the context in which it uses each term might be enough.
Dispatch
This term is indeed mentioned in procedures of determining which method to call.
15.12.2. Compile-Time Step 2: Determine Method Signature
The second step searches the type determined in the previous step for
member methods. This step uses the name of the method and the argument
expressions to locate methods that are both accessible and applicable,
that is, declarations that can be correctly invoked on the given
arguments.
There may be more than one such method, in which case the
most specific one is chosen. The descriptor (signature plus return
type) of the most specific method is the one used at run time to
perform the method dispatch. A method is applicable if it is
applicable by one of strict invocation
The elaboration on what is the "most specific" method is done in 15.12.2.5 Choosing the Most Specific Method.
As for "dynamic dispatch",
JLS 12.5. Creation of New Class Instances:
Unlike C++, the Java programming language does not specify altered
rules for method dispatch during the creation of a new class instance.
If methods are invoked that are overridden in subclasses in the object
being initialized, then these overriding methods are used, even before
the new object is completely initialized.
It includes
Example 12.5-2. Dynamic Dispatch During Instance Creation
class Super {
    Super() {
        printThree();
    }
    void printThree() {
        System.out.println("three");
    }
}
class Test extends Super {
    int three = 3;
    void printThree() {
        System.out.println(three);
    }
    public static void main(String[] args) {
        Test t = new Test();
        t.printThree();
    }
}
Output:
0
3
This happens because during the constructor call chain, Super's constructor calls printThree, but due to dynamic dispatch the method in Test is called, and that is before the field is initialized.
Binding
This term is used in contexts of class member access.
Example 15.11.1-1. Static Binding for Field Access demonstrates early and late bindings. I will summarize the examples given there for the lazy of us:
class S {
    int x = 0;
    int z() { return x; }
}
class T extends S {
    int x = 1;
    int z() { return x; }
}
public class Test1 {
    public static void main(String[] args) {
        S s = new T();
        System.out.println("s.x=" + s.x);
        System.out.println("s.z()=" + s.z());
    }
}
Output:
s.x=0
s.z()=1
Showing that the field uses "early binding", while the instance method uses "late binding":
This lack of dynamic lookup for field accesses allows programs to be run efficiently with
straightforward implementations. The power of late binding and overriding is available, but
only when instance methods are used.
Binding is also used in regards to determining the type of a generic,
8. Classes
Classes may be generic (§8.1.2), that is, they may declare type variables whose bindings may differ among different instances of the class.
Meaning that the type variable of a generic class can be bound to different types in different parameterizations: in List<String> the type variable E is bound to String, while in List<Integer> it is bound to Integer.
This also applies to raw types:
4.8. Raw Types
class Outer<T>{
    T t;
    class Inner {
        T setOuterT(T t1) { t = t1; return t; }
    }
}
The type of the member(s) of Inner depends on the type parameter of Outer. If Outer is raw, Inner must be treated as raw as well, as there is no valid binding for T.
Meaning that declaring Outer outer (this will generate a raw type warning) does not allow the type of T to be determined (obviously, since it wasn't specified in the declaration).
These are general terms. You can summarize them this way: when some thing (a method or an object) is static/early, it is fixed at compile time and there is no ambiguity at run time. For example, in the following code:
class A {
    void methodX() {
        System.out.print("i am A");
    }
}
If we create an instance of A and call methodX(), nothing is ambiguous and everything is fixed at compile time. But if we have the following code
class B extends A {
    void methodX() {
        System.out.print("i am B");
    }
}
....
A objX= new B();
objX.methodX();
The output of methodX() is not known until runtime, so this method is dynamically bound/dispatched (for methods, we can use the term dispatched instead of bound).
When I want to refer to the method in the current scope I still need
to specify class name (for static methods) or this before ::
operator. For example, I need to write:
import java.util.stream.Stream;
public class StreamTest {
    public static int trimmedLength(String s) {
        return s.trim().length();
    }
    public static void main(String[] args) {
        System.out.println(Stream.of(" aaa ", " bb ", " c ")
                .mapToInt(StreamTest::trimmedLength).sum());
    }
}
It's not such a big problem for this, but it sometimes looks overcrowded for static methods, as the class name can be quite long. It would be nice if the compiler allowed me to write simply ::trimmedLength instead:
public static void main(String[] args) {
    System.out.println(Stream.of(" aaa ", " bb ", " c ")
            .mapToInt(::trimmedLength).sum());
}
However, the Java 8 compiler doesn't allow this. To me it seems that it would be quite consistent if the class/object name were resolved in the same manner as for a normal method call. This would also support static imports for method references, which can also be useful in certain cases.
So the question is: why was such or similar syntax not implemented in Java 8? Are there any problems which would arise with such syntax? Or was it simply not considered at all?
I can’t speak for the Java developers but there are some things to consider:
There are four kinds of method references:
1. Reference to a static method, e.g. ContainingClass::staticMethodName
2. Reference to an instance method of a particular object, e.g. containingObject::instanceMethodName
3. Reference to an instance method of an arbitrary object of a particular type, e.g. ContainingType::methodName
4. Reference to a constructor, e.g. ClassName::new
The compiler already has to do some work to disambiguate forms 1 and 3, and sometimes it fails. If the form ::methodName were allowed, the compiler would have to disambiguate between three different forms, as it could be any of forms 1 to 3.
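For reference, here is one small example of each of the four kinds, in the order listed above. The class MethodRefKinds and its method names are invented for the illustration; the referenced JDK methods are standard. Note how forms 1 and 3 look syntactically identical (TypeName::methodName), which is the source of the disambiguation work mentioned.

```java
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

class MethodRefKinds {
    static int staticRef() {
        // Kind 1: static method, the argument is passed straight through.
        Function<String, Integer> parse = Integer::parseInt;
        return parse.apply("42");
    }

    static String boundRef() {
        // Kind 2: instance method of one particular object (greeting).
        String greeting = "Hello, ";
        UnaryOperator<String> greet = greeting::concat;
        return greet.apply("world");
    }

    static String unboundRef() {
        // Kind 3: instance method of an arbitrary object; the first argument becomes the receiver.
        Function<String, String> trim = String::trim;
        return trim.apply("  bb  ");
    }

    static String constructorRef() {
        // Kind 4: constructor reference.
        Supplier<StringBuilder> make = StringBuilder::new;
        return make.get().append("new").toString();
    }

    public static void main(String[] args) {
        System.out.println(staticRef());      // 42
        System.out.println(boundRef());       // Hello, world
        System.out.println(unboundRef());     // bb
        System.out.println(constructorRef()); // new
    }
}
```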
That said, allowing the form ::methodName as a shortcut for any of forms 1 to 3 still wouldn’t imply that it is equivalent to the form methodName(…), as the expression simpleName ( arg_opt ) may refer to
an instance method in the scope of the current class or its superclasses and interfaces
a static method in the scope of the current class or its superclasses
an instance method in the scope of an outer class or its superclasses and interfaces
a static method in the scope of an outer class or its superclasses
a static method declared via import static
So saying something like “::name should be allowed to refer to any method that name(…) may refer to” implies combining the possibilities of these two lists, and you should think twice before making such a wish.
As a final note, you still have the option of writing a lambda expression like args -> name(args) which implies resolving name like a simple method invocation of the form name(args) while at the same time solving the ambiguity problem as it eliminates the option 3 of the method reference kinds, unless you write explicitly (arg1, otherargs) -> arg1.name(otherargs).
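Applied to the example from the question, the lambda alternative might look like the sketch below. The class name LambdaInsteadOfRef and the helper total() are made up; trimmedLength is resolved like any simple method invocation, so no class name is needed.

```java
import java.util.stream.Stream;

class LambdaInsteadOfRef {
    static int trimmedLength(String s) {
        return s.trim().length();
    }

    /** Sums the trimmed lengths; the lambda calls trimmedLength by its simple name. */
    static int total() {
        return Stream.of(" aaa ", " bb ", " c ")
                .mapToInt(s -> trimmedLength(s))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(total()); // 6
    }
}
```

The same lambda form would also work for a method brought into scope via import static.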
In Java I can declare a variable whose name is exactly the same as its class name. I think this is a confusing and strange design.
So I have a question about the code snippet below: how can the compiler tell whether ClassName refers to the variable or to the class?
Judging by the output, the compiler treats ClassName as a variable name.
class ClassName{}
public class Test {
    public static void main(String[] args){
        ClassName ClassName = new ClassName();
        System.out.println(ClassName); //ClassName@18fb53f6
    }
}
The compiler can tell by context. In the example you have given:
ClassName ClassName = new ClassName();
    1         2               3
It can see that 1 is where a type name should be, so it knows you mean the class. Then, 2 is where a variable name is expected, so it knows that this should be the name of a variable. And 3 is coming after the new keyword with parentheses, so it must be the name of a class.
System.out.println( ClassName );
In this instance, ClassName is in the context of argument passing. A type name can't be passed as an argument, so you must mean the name of the variable.
To amuse yourself, you can change the print statement to:
System.out.println( ClassName.class );
Hover your mouse cursor on ClassName and you'll see that the compiler recognizes this as the name of a class. Then change it to:
System.out.println( ClassName.getClass() );
Hover your cursor again, and now you see that it recognizes it as the variable name. That's because .class can only be applied to a type name, while getClass() can only be applied to an object reference. The result of the print statement would be the same in both cases - but through different mechanisms.
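If you don't have an IDE at hand, a sketch like the following shows the same thing in code. ContextDemo and sameClassObject are made-up names, and ClassName is nested only to keep everything in one file. The first expression is a class literal, so ClassName is read as a type; the second is a method call, so ClassName is read as the variable.

```java
class ContextDemo {
    static class ClassName {}

    /** Compares the Class object obtained via the type name with the one obtained via the variable. */
    static boolean sameClassObject() {
        ClassName ClassName = new ClassName();
        Class<?> viaTypeName = ClassName.class;       // ClassName is a type here
        Class<?> viaVariable = ClassName.getClass();  // ClassName is the variable here
        return viaTypeName == viaVariable;
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject()); // true
    }
}
```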
So the compiler has no problem here. But you are right that it's not readable to humans. The convention is that names of variables and methods start with a lowercase letter, while type names start with an uppercase letter. Adhering to this convention will ensure that no such readability problems arise.
I can't say exactly why the authors of Java chose not to enforce this convention (that is, give a compiler error if type names started with a lowercase letter or variable/method names started with an uppercase), but I speculate that they didn't want to make anything an actual error unless it would actually cause an ambiguity for the compiler. Compilation errors are supposed to indicate a problem that makes the compiler unable to do its work.
how can the compiler distinguish the "Classname"
Because there are two components: The variable type and variable name. You declare a variable ClassName of type ClassName. Type always goes first. Classes are not first-class objects (meaning you can't have a reference to a class) unless you get into reflections (with the .class property).
Therefore, in the print statement:
System.out.println(ClassName);
That can only be the variable. System.out.println takes an object reference, and you have an object referred to by a variable named ClassName, therefore the compiler can resolve it.
The only case I can think that is ambiguous to the compiler is if the variable refers to an object which has an instance method of the same name as a static method on the class.
public class SomeClass {
    public void aMethod() {
        System.out.println("A method!");
    }
    public static void aMethod() {
        System.out.println("Static version!");
    }
}
public class TestClass {
    public static void main (String[] args) {
        SomeClass SomeClass = new SomeClass();
        SomeClass.aMethod(); // does this call the instance method or the static method?
    }
}
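I am sure the compiler will detect the ambiguity and handle it in some specified manner (in the Java spec). As far as I know, javac rejects SomeClass outright, because a class may not declare two methods with the same signature, whether static or not, so the first option below applies. The options I had in mind were: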
Don't allow a static and instance method to have the same name.
Allow it, and when resolving the reference at compile-time, prefer the instance method.
Allow it, and when resolving the reference at compile-time, prefer the static method.
If either of the last 2, I imagine a compiler warning would be logged.
Now that the compiler question is aside, the only other consumer of the code is human beings. Compilers may be able to rely on specifications to guarantee rational behavior, but humans can't. We get confused easily. The best advice I have for that is simply, don't do it!
There is absolutely no reason to name a variable identically to a class. In fact, most Java coding style conventions I have seen use lowerCamelCase to name variables and methods and UpperCamelCase to name classes, so there is no way for them to collide unless you deviated from the standards.
If I encountered code like that in a project I was working on, I would immediately rename the variable before doing anything else.
For my ambiguous case of an instance and static method of the same name, there just might be a human lesson in there too: don't do it!
Java has a lot of rules to force you to do things that are logical and make code easy to follow, but at the end of the day, it's still code and you can write any code you want. No language spec or compiler can prevent you from writing confusing code.
ClassName ClassName = new ClassName();
If you have taken a compiler design course, you will know there are lexical and syntax analysis steps. For these, you write a grammar for your language, for example:
ClassName variableName = new ClassName();
So in the example above, the compiler can tell that the second ClassName is a variable.
When you do something like:
ClassName.doSomething();
Java will treat ClassName as a variable rather than a class. This design has no limitation: doSomething() can be either a static method or an instance method.
If Java treated ClassName here as a class, doSomething() could not be an instance method. Maybe that is why Java's creators chose the design above: ClassName as a variable.
But what if a variable were simply not allowed to have the same name as its class? Consider the following example:
ClassA ClassB = new ClassA();
ClassB.callMethodInClassB(); // should this be a compile error or not?
The problem is still there; the confusion still exists. So the new rule would have to be:
No variable may have the same name as any class.
As you can see, such a rule makes the language more complicated to understand and less well-defined. From the arguments above, I think that when you write something such as A A = new A();, treating A as a variable is the best choice in language design.
Hope this helps :)
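The rule that the variable wins can be checked with a small sketch along these lines. ObscuringDemo, A, B and who() are all made-up names. Here a variable named A has type B, and the type A declares a static method with the same name as B's instance method, so the result shows which one the compiler picked.

```java
class ObscuringDemo {
    static class A {
        static String who() { return "static method of class A"; }
    }

    static class B {
        String who() { return "instance method of class B"; }
    }

    static String call() {
        B A = new B();   // a variable named A, of type B
        return A.who();  // A is resolved as the variable, not as the class A
    }

    public static void main(String[] args) {
        System.out.println(call()); // instance method of class B
    }
}
```

Outside the scope of that variable, A.who() would of course call the static method of class A again.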
OK, here's a very curious Java 7 language puzzle for the JLS specialists out there. The following piece of code won't compile, neither with javac nor with Eclipse:
package com.example;
public class X {
    public static X com = new X();
    public void x() {
        System.out.println(com.example.X.com);
        // cannot find symbol  ^^^^^^^
    }
}
It appears as though the member com completely prevents access to the com.* packages from within X. This isn't thoroughly applied, however. The following works, for instance:
public void x() {
    System.out.println(com.example.X.class);
}
My question(s):
How is this behaviour justified from the JLS?
How can I work around this issue?
Note, this is just a simplification for a real problem in generated code, where full qualification of com.example.X is needed and the com member cannot be renamed.
Update: I think it may actually be a similar problem like this one: Why can't I "static import" an "equals" method in Java?
This is called obscuring (jls-6.4.2).
A simple name may occur in contexts where it may potentially be
interpreted as the name of a variable, a type, or a package. In these
situations, the rules of §6.5 specify that a variable will be chosen
in preference to a type, and that a type will be chosen in preference
to a package. Thus, it is may sometimes be impossible to refer to a
visible type or package declaration via its simple name. We say that
such a declaration is obscured.
Your attribute com.example.X.com is not static so it can't be accessed via your X class in a static way. You can access it only via an instance of X.
More than that, each time you instantiate an X, it will create a new X: I can predict a memory explosion here.
Very bad code :)
How can I work around this issue?
Using a fully qualified class name here can be a problem because, in general, package names and variable names both start with lowercase letters and thus can collide. But you do not need a fully qualified class name to refer to a class's static member; you can reference it qualified just by the class name. Since class names should start with an uppercase character, they should never collide with a package name or variable. (And you can import an arbitrary class by its fully qualified class name without issue, because the import statement will never confuse a variable name for a package name.)
public void x() {
    // System.out.println(com.example.X.com);
    // cannot find symbol     ^^^^^^^
    System.out.println(X.com); // Works fine
}