Is there a way to statically reference a method for reflection in Java? Here's some example code to give you an idea of what I am attempting:
public void myFunc(int x) { ... }
public void other() {
Method m1 = getClass().getMethod("myFunc", int.class); // dynamic
Method m2 = this.myFunc; // static
Method m3 = MyClass.myFunc; // static (alternate)
}
I recognize that the above syntax does not work, but I was wondering if there is some sort of syntax similar to this that actually does work. I want a way to use reflection without worrying about the inherent dangers of referencing a method by a string.
Is there a way to do this, or is it just a pipe-dream?
Method references explains
this method to compare the birth dates of two Person instances already exists as Person.compareByAge. You can invoke this method instead in the body of the lambda expression:
Arrays.sort(rosterAsArray,
(a, b) -> Person.compareByAge(a, b)
);
Because this lambda expression invokes an existing method, you can use a method reference instead of a lambda expression:
Arrays.sort(rosterAsArray, Person::compareByAge);
and it goes on to explain the various kinds of method references:
There are four kinds of method references:
Reference to a static method: ContainingClass::staticMethodName
Reference to an instance method of a particular object: containingObject::instanceMethodName
Reference to an instance method of an arbitrary object of a particular type: ContainingType::methodName
Reference to a constructor: ClassName::new
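As a rough illustration of the four kinds listed above, here is a minimal sketch. The class MethodRefKinds and its helper twice are made-up names, not part of any library. Note that each method reference evaluates to an instance of a functional interface, not to a java.lang.reflect.Method, so this gives compile-time checking of the method name but not a reflective handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodRefKinds {
    // Hypothetical helper, present only so there is a static method to refer to.
    static int twice(int x) {
        return 2 * x;
    }

    static List<String> demo() {
        String greeting = "hello";

        // 1. Reference to a static method.
        Function<Integer, Integer> staticRef = MethodRefKinds::twice;
        // 2. Reference to an instance method of a particular object.
        Supplier<String> boundRef = greeting::toUpperCase;
        // 3. Reference to an instance method of an arbitrary object of a particular type.
        BiFunction<String, String, Boolean> unboundRef = String::equalsIgnoreCase;
        // 4. Reference to a constructor.
        Supplier<List<String>> ctorRef = ArrayList::new;

        List<String> out = ctorRef.get();
        out.add(String.valueOf(staticRef.apply(21)));
        out.add(boundRef.get());
        out.add(String.valueOf(unboundRef.apply("Java", "JAVA")));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [42, HELLO, true]
    }
}
```

A misspelled method name in any of these references is a compile error, which is the safety the string-based getMethod call lacks.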
HISTORICAL NOTE (Written before Java 8 was finalized)
I think the Java closures proposal has something like this. Stephen Colebourne says:
Stefan and I are pleased to announce the release of v0.4 of the First-class Methods: Java-style closures proposal.
Changes
Since v0.3, we have tried to incorporate some of the feedback received on the various forums. The main changes are as follows:
1) Constructor and Field literals. It is now possible to create type-safe, compile-time changed instances of java.lang.reflect.Constructor and Field using FCM syntax:
// method literal:
Method m = Integer#valueOf(int);
// constructor literal:
Constructor<Integer> c = Integer#(int);
// field literal:
Field f = Integer#MAX_VALUE;
but I don't think this syntax is available in any shipping JVM. Closures themselves are definitely not in Java 7. You might see it in Java 8.
The Java closures site has a pointer to "Method references" which is a bit more up-to-date though it doesn't look like they've changed the syntax much.
JSR-335 is what you're looking for. Hopefully it will be available in JDK 8.
Would appreciate any help on understanding below two concepts in Java 8.
What I know
A lambda is an object without an identity and should not be used as a regular object.
A lambda expression should not be calling methods from Object class like toString, equals, hashCode etc.
What I'd like to know more about
What makes a lambda expression different, such that it is called an object without identity?
Why exactly should methods from the Object class not be used with lambda expressions?
1) A lambda is an object without an identity and should not be used as a regular object.
This is not true. A lambda is a Java object, and it does have an identity [1]. The problem is that the lifecycle is deliberately left unspecified to allow Java compilers freedom to optimize the code that evaluates them. Therefore, you cannot know when new lambda objects are going to be created, or existing ones are going to be reused.
JLS 15.27.4 says:
"At run time, evaluation of a lambda expression is similar to evaluation of a class instance creation expression, insofar as normal completion produces a reference to an object. Evaluation of a lambda expression is distinct from execution of the lambda body.
Either a new instance of a class with the properties below is allocated and initialized, or an existing instance of a class with the properties below is referenced. ... "
2) A lambda expression should not be calling methods from Object class like toString, equals, hashCode etc.
As written, that is also not true. A lambda expression may call those methods. However, it is not advisable to rely on those methods having any specific behavior when you call them on a lambda object.
The JLS states:
"The class ... may override methods of the Object class."
In other words, it may ... or may not ... override them. If you rely on a particular behavior of these methods, your application is (in theory) non-portable.
Furthermore, since the instantiation semantics are also unspecified, the behavior of Object::equals and Object::hashCode is uncertain.
Finally, it is unspecified whether lambdas are clonable.
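To make the points above concrete, here is a small sketch. LambdaIdentity and make are hypothetical names chosen for illustration. It separates what is well defined (calling the functional interface method) from what the JLS leaves open (instance reuse and the result of toString). No expected output is given for the last two lines because it can differ between JVMs.

```java
import java.util.function.Supplier;

class LambdaIdentity {
    // Evaluates the same non-capturing lambda expression on every call.
    static Supplier<String> make() {
        return () -> "hi";
    }

    public static void main(String[] args) {
        Supplier<String> a = make();
        Supplier<String> b = make();

        // Well defined: invoking the functional interface method.
        System.out.println(a.get() + " " + b.get());

        // Not specified: whether a and b are the same object, and what
        // toString() returns. Code should not depend on either value.
        System.out.println("same instance? " + (a == b));
        System.out.println("toString: " + a);
    }
}
```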
1 - Sure, a lambda doesn't have a name: it is anonymous. But name and identity are different concepts.
Basically, a lambda is a convenience of doing this:
@FunctionalInterface
interface A {
void b();
}
void c(A a) {
...
}
void main() {
c(new A() {
public void b() {
...
}
});
}
I apologize for the less than stellar variable names, but as you can see, A is an interface with one method. c is a method that takes in an A interface. However, instead of creating your own class that implements A, you can create it on the spot. This is what you call an anonymous class, since it doesn't have a name. This is where the quote you have:
A lambda is an object without an identity
comes from. The class doesn't have an identity. This relates to lambdas because, if an interface has only one method, you can use lambdas to simplify it.
void main() {
c(
() -> {
...
}
);
}
This is the exact same as before.
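For anyone who wants to run the comparison, here is a self-contained sketch that reuses the names A, b and c from above. The wrapper class AnonymousVsLambda and the calls counter are additions for illustration. The two forms have the same observable effect here, although they are not compiled identically (for example, this refers to different objects inside each).

```java
class AnonymousVsLambda {
    @FunctionalInterface
    interface A {
        void b();
    }

    // Counts how often a callback passed to c has run (illustration only).
    static int calls = 0;

    static void c(A a) {
        a.b();
    }

    static int demo() {
        calls = 0;
        // Anonymous class: an explicit, unnamed implementation of A.
        c(new A() {
            @Override
            public void b() {
                calls++;
            }
        });
        // Lambda: same observable effect with less ceremony.
        c(() -> calls++);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```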
The second part, why lambdas shouldn't use Object's methods, I didn't know before. You should probably have someone else answer this; however, my guess is that lambda classes don't look like they extend Object directly, so you can't use its methods.
I just read in a book that when a lambda expression is assigned to a functional interface, then that sets the "target type" for the lambda and an instance of that type (that is, the functional interface's type) is created with the lambda expression used as implementation for the abstract method in the functional interface.
My question: If so, then does that mean lambdas aren't really standalone methods and as such a new type of element brought into the language, but are simply a more compact way for expressing an anonymous class and as such merely are added facility (just like generics) on the compiler's side?
Moreover, how do method references comply with that, in particular, static methods which are not associated with any objects? For example, when a method reference to an instance method is assigned to a functional interface then the encapsulating object for that method is used, but what happens in the case of a static method - those are not associated with any object.. ?
If so, then does that mean lambdas aren't really standalone methods and as such a new type of element brought into the language,
Correct, lambdas are compiled into normal methods with a synthetic name
but are simply a more compact way for expressing an anonymous class and as such merely are added facility (just like generics) on the compiler's side?
No, it's not only on the compiler side. There is also code in the JVM involved, so that the compiler doesn't have to write class files for the lambdas.
Moreover, how do method references comply with that, in particular, static methods which are not associated with any objects?
Method references are not different from lambdas: at runtime there has to be an object implementing the functional interface. Upon calling the "SAM" of the object this method will call the referenced method.
For example, when a method reference to an instance method is assigned to a functional interface then the encapsulating object for that method is used,
No, it can't be used. Let's take the following example using a System.out::println method reference:
Arrays.asList("A", "B").forEach(System.out::println);
List<E>.forEach() expects a Consumer<? super E>, which defines the method void accept(E e). The compiler needs to generate bytecode and other information in the class file so that at runtime the JVM can generate a class implementing Consumer<E> with a method void accept(E e). This generated method then calls System.out.println(Object o).
The runtime generated class would look something like
class $$lambda$xy implements Consumer<Object> {
private PrintStream out;
$$lambda$xy(PrintStream out) {
this.out = out;
}
public void accept(Object o) {
out.println(o);
}
}
Your question from the comment: "Why not directly assign to instance and its method?"
Let's expand the example a little bit:
static void helloWorld(Consumer<String> consumer) {
consumer.accept("Hello World!");
}
public static void main(String[] args) {
helloWorld(System.out::println);
}
To compile this, the compiler has to generate bytecode that creates an object implementing Consumer<String> (so it can pass the object into helloWorld()). That object somehow has to store the information that upon calling its accept(x) method it has to call println(x) on the System.out PrintStream.
Other languages may have other names or concepts for this kind of object. In Java the established concept is "an anonymous class implementing the interface and an object of that anonymous class".
How does the object store this information? Well, you could invent some super cool new way to store this information. The Java Language designers decided that an anonymous class would be good enough - for the time being. But they had the foresight that if someone came along with a new idea to implement it in a more efficient way, this should be easy to integrate into the Java ecosystem (Java compiler and JVM).
So they also decided to create that anonymous class not at compile time but to let the compiler just write the necessary information into the class file. Now the JVM can at runtime decide on what the optimal way to store the information (calling the correct method on the correct object) is.
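As a sketch of the idea, the class below spells out by hand roughly what such an object has to contain: the captured PrintStream plus an accept method that forwards to println. PrintlnConsumer and HandWrittenConsumer are made-up names, and the real class produced at run time is an implementation detail of the JVM that may look quite different. The demo writes to an in-memory stream so that both variants can be compared.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.function.Consumer;

class HandWrittenConsumer {
    // Hand-written stand-in for the class the JVM may generate at run time.
    static final class PrintlnConsumer implements Consumer<String> {
        private final PrintStream out; // the captured receiver

        PrintlnConsumer(PrintStream out) {
            this.out = out;
        }

        @Override
        public void accept(String s) {
            out.println(s);
        }
    }

    static void helloWorld(Consumer<String> consumer) {
        consumer.accept("Hello World!");
    }

    // Runs both variants against an in-memory stream and returns what was printed.
    static String demo() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        helloWorld(out::println);             // method reference
        helloWorld(new PrintlnConsumer(out)); // explicit class
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Both calls print the same line, which is the sense in which the method reference "is" an object holding a receiver and a target method.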
For example, when a method reference to an instance method is assigned
to a functional interface then the encapsulating object for that
method is used, but what happens in the case of a static method -
those are not associated with any object..
That depends on context. Let's say we have a static Utils#trim(String) method that trims the given string.
Now let's take a List<String> list with some strings in it. We can do something like this:
list.stream().map(Utils::trim).collect(Collectors.toList());
As you can see, in this context we are using a static method reference so that every string in the list is passed as the input argument to Utils.trim.
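Here is a runnable version of that snippet. Utils is a hypothetical class defined inline as a stand-in, since the original does not show it. The point is that Utils::trim is shorthand for s -> Utils.trim(s): the stream element becomes the argument, and no Utils instance is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StaticRefInStream {
    // Hypothetical stand-in for the Utils class mentioned above.
    static final class Utils {
        static String trim(String s) {
            return s.trim();
        }
    }

    static List<String> demo() {
        List<String> list = Arrays.asList("  a", "b  ", " c ");
        // Each element of the stream is passed as the argument to Utils.trim.
        return list.stream().map(Utils::trim).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b, c]
    }
}
```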
There are so many related names (early and late binding, static and dynamic dispatch, runtime vs. compile-time polymorphism, etc.) that I don't understand the difference.
I found a clear explanation, but is it correct? I'll paraphrase JustinC:
Binding: is determining the type of a variable (object?). If it's done at compile time, it's early binding. If it's done at run time, it's late binding.
Dispatch: is determining which method matches the method call. Static Dispatch is computing methods at compile time, whereas dynamic dispatch is doing it at run time.
Is Binding matching up primitive and reference variables with primitive values and objects respectively?
Edit: Please give me some clear reference material so I can read more about this.
I believe the confusion typically comes from how overloaded these terms are.
We program our programs in a high level language, and either a compiler or an interpreter must transform that into something a machine actually understands.
In coarse terms, you can picture a compiler transforming our method code into some form of machine code. If the compiler knew at that point exactly where in memory that method would reside when we run our program later, then it could safely go and find every invocation of this compiled method and replace it with a jump to the address where the compiled code resides, right?
Well, materializing this relationship is what I understand as binding. This binding, though, could happen at different moments, for example at compile time, linking time, load time, or at run time depending on the design of the language.
The terms static and dynamic are generally used to refer to things bound before run time and at run time, respectively.
Later binding times are associated with greater flexibility, earlier binding times are associated with greater efficiency. Language designers have to balance these two aspects when they're creating a language.
Most object-oriented programming languages support subtype polymorphism. In these languages, virtual methods are bound at runtime depending on the dynamic type of the object at that point. In other words, virtual method invocations are dispatched to the appropriate implementation at runtime based on the dynamic type of the object implementation involved and not based solely on its static type reference.
So, in my opinion, you must first bind the method invocation to a specific implementation or execution address, and then you can dispatch an invocation to it.
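A small sketch may help separate the two ideas in Java. The class names Animal and Dog and the describe overloads are invented for this example. The overload is chosen at compile time from the static type of the argument, while the overridden method is chosen at run time from the dynamic type of the receiver.

```java
class DispatchDemo {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "Woof"; }
    }

    // Overloads: the compiler picks one based on the static type of the argument.
    static String describe(Animal a) { return "animal"; }
    static String describe(Dog d) { return "dog"; }

    static String demo() {
        Animal pet = new Dog(); // static type Animal, dynamic type Dog
        // describe(pet) is bound at compile time to describe(Animal);
        // pet.sound() is dispatched at run time to Dog.sound().
        return describe(pet) + " says " + pet.sound();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "animal says Woof"
    }
}
```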
I had answered a very similar question in the past in which I demonstrate with examples how this happens in Java.
I would also recommend reading the book Programming Language Pragmatics. It is a great reference to learn all this kind of stuff from a theoretical standpoint.
When you're looking for "low level" definitions, probably the only legitimate source is our old friend - the JLS. Though it does not give a clear definition in this case, the context in which it uses each term might be enough.
Dispatch
This term is indeed mentioned in procedures of determining which method to call.
15.12.2. Compile-Time Step 2: Determine Method Signature
The second step searches the type determined in the previous step for
member methods. This step uses the name of the method and the argument
expressions to locate methods that are both accessible and applicable,
that is, declarations that can be correctly invoked on the given
arguments.
There may be more than one such method, in which case the
most specific one is chosen. The descriptor (signature plus return
type) of the most specific method is the one used at run time to
perform the method dispatch. A method is applicable if it is
applicable by one of strict invocation
The elaboration on what is the "most specific" method is done in 15.12.2.5 Choosing the Most Specific Method.
As for "dynamic dispatch",
JLS 12.5. Creation of New Class Instances:
Unlike C++, the Java programming language does not specify altered
rules for method dispatch during the creation of a new class instance.
If methods are invoked that are overridden in subclasses in the object
being initialized, then these overriding methods are used, even before
the new object is completely initialized.
It includes
Example 12.5-2. Dynamic Dispatch During Instance Creation
class Super {
Super() {
printThree();
}
void printThree() {
System.out.println("three");
}
}
class Test extends Super {
int three = 3;
void printThree() {
System.out.println(three);
}
public static void main(String[] args) {
Test t = new Test();
t.printThree();
}
}
Output:
0
3
This happens because during the constructor call chain, Super's constructor calls printThree, but due to dynamic dispatch the method in Test is called, and that is before the field is initialized.
Binding
This term is used in contexts of class member access.
Example 15.11.1-1. Static Binding for Field Access demonstrates early and late binding. I will summarize the examples given there for the lazy among us:
class S {
int x = 0;
int z() { return x; }
}
class T extends S {
int x = 1;
int z() { return x; }
}
public class Test1 {
public static void main(String[] args) {
S s = new T();
System.out.println("s.x=" + s.x);
System.out.println("s.z()=" + s.z());
}
}
Output:
s.x=0
s.z()=1
Showing that the field uses "early binding", while the instance method uses "late binding":
This lack of dynamic lookup for field accesses allows programs to be run efficiently with
straightforward implementations. The power of late binding and overriding is available, but
only when instance methods are used.
Binding is also used in regards to determining the type of a generic,
8. Classes
Classes may be generic (§8.1.2), that is, they may declare type variables whose bindings may differ among different instances of the class.
Meaning that if you create a List<String> and a List<Integer>, the type variable of List is bound to String in one instance and to Integer in the other.
This also applies to raw types:
4.8. Raw Types
class Outer<T>{
T t;
class Inner {
T setOuterT(T t1) { t = t1; return t; }
}
}
The type of the member(s) of Inner depends on the type parameter of Outer. If Outer is raw, Inner must be treated as raw as well, as there is no valid binding for T.
Meaning that declaring Outer outer (this will generate a raw type warning) does not make it possible to determine the type of T (obviously, since it wasn't specified in the declaration).
These are general terms. You can summarize them this way: when something (a method or an object) is static/early, it is configured at compile time and there is no ambiguity at run time. For example, in the following code:
class A {
void methodX() {
System.out.print("i am A");
}
}
If we create an instance of A and call methodX(), nothing is ambiguous and everything is configured at compile time. But if we have the following code
class B extends A {
void methodX() {
System.out.print("i am B");
}
}
....
A objX= new B();
objX.methodX();
The output of methodX() is not known until runtime, so this method is dynamically bound/dispatched (for methods, we can use the term "dispatched" instead of "bound").
I am trying to get a simple java reflection program working in Scala, and seem to be missing something ...
scala> val cl = new URLClassLoader(Array(new File("Hi.jar").toURI.toURL), getClass.getClassLoader)
cl: java.net.URLClassLoader = java.net.URLClassLoader@3c7b137a
scala> val c = cl.loadClass("Hi")
c: Class[_] = class Hi
scala> val m = c.getMethod("run")
m: java.lang.reflect.Method = public void Hi.run()
scala> m.invoke()
<console>:21: error: not enough arguments for method invoke: (x$1: Any, x$2: Object*)Object.
Unspecified value parameters x$1, x$2.
m.invoke()
^
What am I missing, as the prior line has indicated -
public void Hi.run()
What exactly is it expecting for the two arguments?
Scala is telling you exactly what your problem is: invoke needs 1+ parameters!
See the java doc:
invoke(Object obj, Object... args)
Invokes the underlying method represented by this Method object, on the specified object with the specified parameters.
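So, you have to provide at least one argument: a reference to the object you want to call the method on (for a static method this argument is ignored and may be null). Since the printed signature public void Hi.run() carries no static modifier, run() is an instance method, so you need to pass an instance of Hi, for example one created from your c.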
The following arguments would be the actual parameters that your "reflected" method expects. In your case, no further arguments.
Long story short: you had better keep the excellent tutorials from Oracle on reflection close to your Scala console while experimenting. If you try to learn reflection by trial & error, I guarantee you a lot of frustrating trials with many strange errors. Really: the reflection API is not very forgiving when you don't know what you are doing; even the slightest mistakes can lead to very unexpected results.
There is nothing specific to Scala here. Method.invoke requires at least one argument: the instance on which the method is invoked (or null for a static method).
In Scala, you can use structural typing for such a simple case.
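Since the call works the same way from Java, here is a minimal Java sketch of both cases. The nested Hi class and its runStatic method are hypothetical stand-ins (the real Hi.jar is not shown in the question), and they return strings only so the result is easy to inspect.

```java
import java.lang.reflect.Method;

class InvokeDemo {
    // Hypothetical stand-in for the Hi class from the question.
    public static class Hi {
        public String run() { return "instance run"; }
        public static String runStatic() { return "static run"; }
    }

    static String demo() {
        try {
            Class<?> c = Hi.class;

            // Instance method: the first argument to invoke is the receiver.
            Method run = c.getMethod("run");
            Object hi = c.getDeclaredConstructor().newInstance();
            Object r1 = run.invoke(hi);

            // Static method: the receiver is ignored, so null is conventional.
            Method runStatic = c.getMethod("runStatic");
            Object r2 = runStatic.invoke(null);

            return r1 + ", " + r2;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "instance run, static run"
    }
}
```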
Consider this simple Java class:
class MyClass {
public void bar(MyClass c) {
c.foo();
}
}
I want to discuss what happens on the line c.foo().
Original, Misleading Question
Note: Not all of this actually happens with each individual invokevirtual opcode. Hint: If you want to understand Java method invocation, don't read just the documentation for invokevirtual!
At the bytecode level, the meat of c.foo() will be the invokevirtual opcode, and, according to the documentation for invokevirtual, more or less the following will happen:
1. Look up the foo method defined in compile-time class MyClass. (This involves first resolving MyClass.)
2. Do some checks, including: verify that foo is not an initialization method, and verify that calling MyClass.foo wouldn't violate any protected modifiers.
3. Figure out which method to actually call. In particular, look up c's runtime type. If that type has foo(), call that method and return. If not, look up c's runtime type's superclass; if that type has foo, call that method and return. If not, look up c's runtime type's superclass's superclass; if that type has foo, call that method and return, etc. If no suitable method can be found, then error.
Step #3 alone seems adequate for figuring out which method to call and verifying that said method has the correct argument/return types. So my question is why step #1 gets performed in the first place. Possible answers seem to be:
A. You don't have enough information to perform step #3 until step #1 is complete. (This seems implausible at first glance, so please explain.)
B. The linking or access modifier checks done in #1 and #2 are essential to prevent certain bad things from happening, and those checks must be performed based on the compile-time type, rather than the run-time type hierarchy. (Please explain.)
Revised Question
The core of the javac compiler output for the line c.foo() will be an instruction like this:
invokevirtual i
where i is an index to MyClass' runtime constant pool. That constant pool entry will be of type CONSTANT_Methodref_info, and will indicate (maybe indirectly) A) the name of the method called (i.e. foo), B) the method signature, and C) the name of compile time class that the method is called on (i.e. MyClass).
The question is, why is the reference to the compile-time type (MyClass) needed? Since invokevirtual is going to do dynamic dispatch on the runtime type of c, isn't it redundant to store the reference to the compile-time class?
It is all about performance. By figuring out the compile-time type (aka static type), the JVM can compute the index of the invoked method in the virtual function table of the runtime type (aka dynamic type). Using this index, step 3 simply becomes an access into an array, which can be accomplished in constant time. No looping is needed.
Example:
class A {
void foo() { }
void bar() { }
}
class B extends A {
void foo() { } // Overrides A.foo()
}
By default, A extends Object, which defines these methods (final methods omitted, since they cannot be overridden and so need no dynamic dispatch):
class Object {
public int hashCode() { ... }
public boolean equals(Object o) { ... }
public String toString() { ... }
protected void finalize() { ... }
protected Object clone() { ... }
}
Now, consider this invocation:
A x = ...;
x.foo();
By figuring out that x's static type is A the JVM can also figure out the list of methods that are available at this call site: hashCode, equals, toString, finalize, clone, foo, bar. In this list, foo is the 6th entry (hashCode is 1st, equals is 2nd, etc.). This calculation of the index is performed once - when the JVM loads the classfile.
After that, whenever the JVM processes x.foo() it just needs to access the 6th entry in the list of methods that x offers, conceptually equivalent to x.getClass().getMethods()[5] (which points at A.foo() if x's dynamic type is A), and invoke that method. No need to exhaustively search this array of methods.
Note that the method's index remains the same regardless of the dynamic type of x. That is: even if x points to an instance of B, the 6th method is still foo (although this time it will point at B.foo()).
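The fixed-index idea can be modeled in plain Java as a toy. This is only an illustration of the concept, not how any particular JVM lays out its tables: each "class" is a list of method implementations, the slot numbers FOO and BAR are invented constants, and the slots inherited from Object are left out for brevity (so foo sits in slot 0 here rather than being the 6th entry).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class VtableSketch {
    // Slot numbers are fixed by the static type A and shared by all subclasses.
    static final int FOO = 0;
    static final int BAR = 1;

    // Toy "vtable" for A: one entry per virtual method.
    static final List<Supplier<String>> VTABLE_A =
            Arrays.asList(() -> "A.foo", () -> "A.bar");

    // B overrides foo (same slot) and inherits bar from A.
    static final List<Supplier<String>> VTABLE_B =
            Arrays.asList(() -> "B.foo", VTABLE_A.get(BAR));

    // "Dispatch" is a constant-time indexed access; no search up the hierarchy.
    static String invoke(List<Supplier<String>> vtableOfDynamicType, int slot) {
        return vtableOfDynamicType.get(slot).get();
    }

    public static void main(String[] args) {
        System.out.println(invoke(VTABLE_A, FOO)); // prints A.foo
        System.out.println(invoke(VTABLE_B, FOO)); // prints B.foo
        System.out.println(invoke(VTABLE_B, BAR)); // prints A.bar
    }
}
```

The call site only ever needs the slot number, which is why it can be computed once from the static type and reused for every receiver.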
Update
[In light of your update]: You're right. In order to perform a virtual method dispatch all the JVM needs is the name+signature of the method (or the offset within the vtable). However, the JVM does not execute things blindly. It first checks that the classfiles loaded into it are correct in a process called verification.
Verification expresses one of the design principles of the JVM: It does not rely on the compiler to produce correct code. It checks the code itself before it allows it to be executed. In particular, the verifier checks that every invoked virtual method is actually defined by the static type of the receiver object. Obviously, the static type of the receiver is needed to perform such a check.
That's not the way I understand it after reading the documentation. I think you have steps 2 and 3 transposed, which would make the whole series of events more logical.
Presumably, #1 and #2 have already been done by the compiler. I suspect that at least part of the purpose is to make sure that they still hold with the version of the class in the runtime environment, which may be different from the version the code was compiled against.
I haven't digested the invokevirtual documentation to verify your summary, though, so Rob Heiser could be right.
I'm guessing answer "B".
The linking or access modifier checks done in #1 and #2 are essential to prevent certain bad things from happening, and those checks must be performed based on the compile-time type, rather than the run-time type hierarchy. (Please explain.)
#1 is described by 5.4.3.3 Method Resolution, which makes some important checks. For example, #1 checks the accessibility of the method in the compile-time type and may return an IllegalAccessError if it is not:
...Otherwise, if the referenced method is not accessible (§5.4.4) to D, method resolution throws an IllegalAccessError. ...
If you only checked the run-time type (via #3), then the run-time type could illegally widen the accessibility of the overridden method (a.k.a. a "bad thing"). It's true that the compiler should prevent such a case, but the JVM is nevertheless protecting itself from rogue code (e.g. manually-constructed malevolent code).
To totally understand this stuff, you need to understand how method resolution works in Java. If you're looking for an in-depth explanation, I suggest looking at the book, "Inside the Java Virtual Machine". The following sections from Chapter 8, "The Linking Model", are available online and seem particularly relevant:
Chapter introduction
Resolution of CONSTANT_Methodref_info Entries
Direct References
(CONSTANT_Methodref_info entries are entries in the class file header that describe the methods called by that class.)
Thanks to Itay for inspiring me to do the Googling required to find this.