Dynamically creating a java class through one method's bytecode

Dynamically creating a java class through one method's bytecode - java

I understand if I have a class file I can load it at run time and execute it's methods through classLoader. However, what if I only have bytecode or java code for a single method? Is it possible to dynamically create a class at run time and then invoke the method?

There is a planned feature, JEP 8158765: Isolated Methods, also on the bugtracking list, which would allow to load and execute such bytecode, without generating a fully materialized Class. It could look like
MethodHandle loadCode(String name, MethodType type, byte[] instructions, Object[] constants)
in the class MethodHandles.Lookup
However, this feature is in draft state, so it could take significant time before becoming an actual API and it might even happen that it gets dropped in favor of an entirely different feature covering the use cases the authors of the JEP have in mind.
Until then, there is no way around generating the necessary bytes before and after the method’s bytecode, to describe a complete class and loading that class. But of course, you can write your own method accepting a method’s byte code and some metadata, like the expected signature, generating such a class and reuse that method.
Note that there’s an alternative to creating a new ClassLoader, Class<?> defineClass(byte[] bytes) in class MethodHandles.Lookup which allows to add a class to an existing class loading context, since Java 9.

The bytecode for a method refers to entries in the class's constant pool, so it doesn't make sense in isolation.

Related

ByteBuddy agent to replace one method param with another

I have a large 3rd party code base I can't modify, but I need to make a small but important change in many different places. I was hoping to use a ByteBuddy based agent, but I can't figure out how. The call I need to replace is of the form:
SomeSystemClass.someMethod("foo")
and I need to replace it with
SomeSystemClass.someMethod("bar")
while leaving all other calls to the same method untouched
SomeSystemClass.someMethod("ignore me")
Since SomeSystemClass is a JDK class, I do not want to advise it, but only the classes that contain calls to it. How can this be done?
Note that:
someMethod is static and
the calls (at least some of them) are inside a static initializer block

There are two approaches to this with Byte Buddy:
You transform all classes with the call site in question:
new AgentBuilder.Default()
.type(nameStartsWith("my.lib.pkg."))
.transform((builder, type, loader, module) -> builder.visit(MemberSubstitution.relaxed()
.method(SomeSystemClass.class.getMethod("someMethod", String.class))
.replaceWith(MyAlternativeDispatcher.class.getMethod("substitution", String.class)
.on(any()))
.installOn(...);
In this case, I suggest you to implement a class MyAlternativeDispatcher to your class path (it can also be shipped as part of the agent unless you have a more complex class loader setup such as OSGi where you implement the conditional logic:
public class MyAlternativeDispatcher {
public static void substitution(String argument) {
if ("foo".equals(argument)) {
argument = "bar";
}
SomeSystemClass.someMethod(argument);
}
}
Doing so, you can set break points and implement any complex logic without thinking too much of byte code after setting up the agent. You can, as suggested, even ship the substitution method independently of the agent.
Instrument the system class itself and make it caller sensitive:
new AgentBuilder.Default()
.with(RedefinitionStrategy.RETRANSFORMATION)
.disableClassFormatChanges()
.type(is(SomeSystemClass.class))
.transform((builder, type, loader, module) -> builder.visit(Advice.to(MyAdvice.class).on(named("someMethod").and(takesArguments(String.class)))))
.installOn(...);
In this case, you'd need to reflect on the caller class to make sure you only alter behavior for the classes you want to apply this change for. This is not uncommon within the JDK and since Advice inlines ("copy pastes") the code of your advice class into the system class, you can use the JDK internal APIs without restriction (Java 8 and prior) if you cannot use the stack walker API (Java 9 and later):
class MyAdvice {
#Advice.OnMethodEnter
static void enter(#Advice.Argument(0) String argument) {
Class<?> caller = sun.reflect.Reflection.getCallerClass(1); // or stack walker
if (caller.getName().startsWith("my.lib.pkg.") && "foo".equals(argument)) {
argument = "bar";
}
}
}
Which approach should you choose?
The first approach is probably more reliable but it is rather costly since you have to process all classes in a package or subpackages. If there are many classes in this package you will pay quite a price for processing all these classes to check for relevant call sites and therefore delay application startup. Once all classes are loaded, you have however paid the price and everything is in place without having altered a system class. You do however need to take care of class loaders to make sure that your substitution method is visible to everybody. In the simplest case, you can use the Instrumentation API to append a jar with this class to the boot loader what makes it globally visible.
With the second approach, you only need to (re-)transform a single method. This is very cheap to do but you will add a (minimal) overhead to every call to the method. Therefore, if this method is invoked a lot on a critical execution path, you'd pay a price on every invocation if the JIT does not discover an optimization pattern to avoid it. I'd prefer this approach for most cases, I think, a single transformation is often more reliable and performant.
As a third option, you could also use MemberSubstitution and add your own byte code as a replacement (Byte Buddy exposes ASM in the replaceWith step where you can define custom byte code instead of delegating). This way, you could avoid the requirement of adding a replacement method and just add the substitution code in-place. This does however bear the serious requirement that you:
do not add conditional statements
recompute the stack map frames of the class
The latter is required if you add conditional statements and Byte Buddy (or anybody) cannot optimize it in-method. Stack map frame recomputation is very expensive, fails comparable often and can require class loading locks to dead lock. Byte Buddy optimizes ASM's default recomputation, trying to avoid dead locks by avoiding class loading but there is no guarantee either, so you should keep this in mind.

Reduce visibility of classes and methods

TL;DR: Given bytecode, how can I find out what classes and what methods get used in a given method?
In my code, I'd like to programmatically find all classes and methods having too generous access qualifiers. This should be done based on an analysis of inheritance, static usage and also hints I provide (e.g., using some home-brew annotation like #KeepPublic). As a special case, unused classes and methods will get found.
I just did something similar though much simpler, namely adding the final keyword to all classes where it makes sense (i.e., it's allowed and the class won't get proxied by e.g., Hibernate). I did it in the form of a test, which knows about classes to be ignored (e.g., entities) and complains about all needlessly non-final classes.
For all classes of mine, I want to find all methods and classes it uses. Concerning classes, there's this answer using ASM's Remapper. Concerning methods, I've found an answer proposing instrumentation, which isn't what I want just now. I'm also not looking for a tool like ucdetector which works with Eclipse AST. How can I inspect method bodies based on bytecode? I'd like to do it myself so I can programmatically eliminate unwanted warnings (which are plentiful with ucdetector when using Lombok).

Looking at the usage on a per-method basis, i.e. by analyzing all instructions, has some pitfalls. Besides method invocations, there might be method references, which will be encoded using an invokedynamic instruction, having a handle to the target method in its bsm arguments. If the byte code hasn’t been generated from ordinary Java code (or stems from a future version), you have to be prepared to possibly encounter ldc instructions pointing to a handle which would yield a MethodHandle at runtime.
Since you already mentioned “analysis of inheritance”, I just want to point out the corner cases, i.e. for
package foo;
class A {
public void method() {}
}
class B implements bar.If {
}
package bar;
public interface If {
void method();
}
it’s easy to overlook that A.method() has to stay public.
If you stay conservative, i.e. when you can’t find out whether B instances will ever end up as targets of the If.method() invocations at other places in your application, you have to assume that it is possible, you won’t find much to optimize. I think that you need at least inlining of bridge methods and the synthetic inner/outer class accessors to identify unused members across inheritance relationships.
When it comes class references, there are indeed even more possibilities, to make a per-instruction analysis error prone. They may not only occur as owner of member access instructions, but also for new, checkcast, instanceof and array specific instructions, annotations, exception handlers and, even worse, within signatures which may occur at member references, annotations, local variable debugging hints, etc. The ldc instruction may refer to classes, producing a Class instance, which is actually used in ordinary Java code, e.g. for class literals, but as said, there’s also the theoretical possibility to produce MethodHandles which may refer to an owner class, but also have a signature bearing parameter types and a return type, or to produce a MethodType representing a signature.
You are better off analyzing the constant pool, however, that’s not offered by ASM. To be precise, a ClassReader has methods to access the pool, but they are actually not intended to be used by client code (as their documentation states). Even there, you have to be aware of pitfalls. Basically, the contents of a CONSTANT_Utf8_info bears a class or signature reference if a CONSTANT_Class_info resp. the descriptor index of a CONSTANT_NameAndType_info or a CONSTANT_MethodType_info points to it. However, declared members of a class have direct references to CONSTANT_Utf8_info pool entries to describe their signatures, see Methods and Fields. Likewise, annotations don’t follow the pattern and have direct references to CONSTANT_Utf8_info entries of the pool assigning a type or signature semantic to it, see enum_const_value and class_info_index…

Is there an analogue of visitLdcInsn for loading objects (not constant)?

We wrote a simple PostScript interpreter in Java and want to optimize it by generating bytecode directly for specific parts of source code. For this we need to load the object from the context of the Java bytecode context. Specify such object in the signature of the generated bytecode method is not good, because they may be in a large amount in our case.
In Java Asm we have method
public void visitLdcInsn(Object cst)
It visits a LDC instruction. Parameter cst - the constant to be loaded on the stack.
Is there any way to load not constant object?
Thanks

Since Java 11, it is possible to load arbitrary constants using the LDC instruction. These may be objects of arbitrary type but meant to bear constant semantics, so they should be preferably immutable.
For this to work, the referenced constant pool entry has to be a CONSTANT_Dynamic_info, which has a similar structure as the CONSTANT_InvokeDynamic_info, likewise describing a bootstrap method.
One difference is that the name_and_type_index entry of the dynamic info structure will point to a field descriptor. Further, the bootstrap method has a signature of (MethodHandles.Lookup,String,Class[,static arguments]) having a Class argument representing the expected type of the constant, rather than a MethodType object. The bootstrap method has to directly return the constant value rather than a call-site.
Common to the invokedynamic instruction is that the result of the first bootstrapping process will get associated with the LDC instruction and used in all subsequent executions (as it is supposed to be a constant).
An interesting property of these dynamic constants is that they are valid static arguments to the bootstrap method for another dynamic constant or an invokedynamic instruction (as long as there is no cyclic dependency between the dynamic constants).
Note that there is already a convenience class containing some ready-to-use bootstrap methods for dynamic constants.

ldc can be used for loading values of type int, float, String, Class, MethodType or MethodHandle; ldc2_w supports values of type long and double. 1
As said, within Oracle’s JVM implementation there is the internally used Unsafe API which allows patching in runtime objects as replacements for constants but that has several drawbacks. First, it’s obviously not part of the official API, not present in every JVM and might even disappear (or change method signatures) in future Oracle JVMs. Further, the ASM framework will not be aware of what you are going to do and have difficulties to generate the appropriate bytecode for later-on patches.
After all, it’s not clear, what the advantage of abusing ldc for a runtime object in your project shall be. Generating the code for passing the instance as method or constructor parameter and storing an object in a field is not very complicated with ASM. And for the program logic, it doesn’t matter whether you use ldc or, e.g. getstatic, right before using the value.

As the bad way of using the Unsafe was pointed out (it is not really an option either as it requires you to load the classes anonymously):
I assume that you are creating a class during build time but you want to inject some sort of runtime context into these classes which are required for running your instrumentation. You can at least emulate this by writing a specialized ClassLoader for your application which is aware of this context and which explicitly initializes a class by for example an annotation.
This means you instrument a class such as:
#Enhanced
class Foo {
static EnhancementDelegate delegate;
void instrumentedMethod() {
// do something with delegate
}
}
at build-time and you initialize is explicitly at load time:
class EnhancementClassLoader extends ClassLoader {
#Override
protected Class<?> loadClass(String name) {
Class<?> clazz = super.loadClass(name);
if(clazz.isAnnotationPresent(Enhanced.class)) {
// do initialization stuff
}
return clazz;
}
}
Would this help you out? It is kind of a guess what you are trying to achieve but I think this might be a good solution. Check out my project Byte Buddy which solves a similar problem for proxy classes by introducing a LoadedTypeInitializer.

java generics vs dynamically loading class using Class.forName()

Assume I am making a class called Government. Government has members like officers, ministers, departments etc. For each of those members I create an interface, and any specific government defines them as they like.
The main method in the Government class is called Serve(Request req). Assume that the query rate is very large (1000+ queries per second).
To create the government, I can:
1) Use Java generics to write Government<Class Minister, Class Officer, ...> and any specific government implementation needs to create its own Government object in java code, and a main() to have deployable jar.
2) Have a configuration file which specifies the class names of officers, ministers etc., and whenever Serve() is called, it uses Class.forName()and Class.newInstance() to create an object of the class. Any new government just needs to write classes for its members, and the configuration file. There is one single main() for all governments.
From a purely performance point of view - which is better and why? My main concerns are:
a) does forName() execute a costly search every time? Assume a very large universe of classes.
b) Do we miss out on the compiler optimizations that might be performed in case 1 but not in case 2 on the dynamic classes?

As long as you reuse your goverment object, there is no difference at runtime. Difference isonly at object creation time.
1 & 2 differ in concept - 1 ist hardwired, while 2 is dynamic ( you may even use DI contrainers like spring, guice or pico - basically you proposed to write your own )
As for forName() performance - it is up on classloader ( and also on container ) . MOst of them would cache name resolution results, look up in map - but I can not speak for all
As for optimisations - there are compiler optimisations, and also agressive runtime optiomisations from JIT compilers - they matter more.

I don't get it. Those two are not alternatives; they are pretty much orthogonal: Generics is a compile-time construct. It is erased and does not translate to anything at runtime. On the other hand, loading classes by calling forName is a runtime thing. One does not affect the other.
If you are using generics, then that means you don't need the class object at runtime, since in generics you don't have access to the class object, unless you pass it in explicitly. If you don't need the class object at runtime, that means you don't need to load it with forName, so that is inconsistent with forName. If you do pass the class object in explicitly, then that means you already have the class object and don't need to load it, also inconsistent with forName.

Your description kind of reads like this to me: "I want to use dependency injection, should I roll my own?"
Look into Spring (or Guice by Google). I'll assume Spring.
Create interfaces for stuff and configure which implementation to use for each in Spring.

Java reflection run-time performance

This is an academical exercise (disclaimer).
I'm building an application that will profit from being as fast as possible as it will compete with others.
I know that declaring a class using reflection (example below) will suffer a huge penalty as opposed to a standard declaration.
Class mDefinition = Class.forName("MySpecialClassString");
Constructor mConstructor = mDefinition.getConstructor(new Class[]{MySpecialClass.class});
myClass = (MySpecialClass) mConstructor.newInstance(this);
However, after declaring myClass if I use it in a standard fashion myClass.myMethod() will I also suffer from performance hogs or will it be the same as if I had declared the class in a standard fashion?

There will be a performance penalty when you first instantiate the object. Once the class is loaded, its the same as if it had been instantiated normally, and there will be no further performance penalty.
Going further, if you call methods using reflection, there will be a performance penalty for about fifteen times (default in Java), after which the reflected call will be rewritten by the JVM to be the exact same as a statically compiled call. Therefore, even repeatedly reflected method calls will not cause a performance decrease, once that bytecode has been recompiled by the JVM.
See these two link for more information on that:
Any way to further optimize Java reflective method invocation?
http://inferretuation.blogspot.com/2008/11/in-jit-we-trust.html

Once the class has been loaded you should be fine. The overhead is associated with inspecting the runtime structures that represent the class, etc. Calling methods in a standard fashion should be fine, however if you start searching for methods by name or signature, that will incur additional overhead.

Chris Thompson's answer is on the mark. However I'm confused by your code example.
this will dynamically load the class:
Class mDefinition = Class.forName("MySpecialClassString");
This will get a Contructor for your class, which takes an instance of that same class as an argument. Also note that you're accessing the class at compile time with MySpecialClass.class:
Constructor mConstructor = mDefinition.getConstructor(new Class[]{MySpecialClass.class});
This is instantiating a MySpecialClass by passing this into the constructor:
myClass = (MySpecialClass) mConstructor.newInstance(this);
Based on the constructor argument, does that mean we are in an instance method of MySpecialClass? Very confused.
EDIT: This is closer to what I would have expected to see:
Class<?> mDefinition = Class.forName("MySpecialClassString");
//constructor apparently takes this as argument
Class<?> constructorArgType = this.getClass(); //could be ThisClassName.class
Constructor<?> mConstructor = mDefinition.getConstructor(constructorArgType);
MySpecialInterface mySpecialInstance = (MySpecialInterface)mConstructor.newInstance(this);
where MySpecialInterface is an interface used to interact with your dynamically loaded classes:
interface MySpecialInterface {
//methods used to interface with dynamically loaded classes
}
Anyway please let me know if I'm misunderstanding or off base here.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Dynamically creating a java class through one method's bytecode - java

I understand if I have a class file I can load it at run time and execute it's methods through classLoader. However, what if I only have bytecode or java code for a single method? Is it possible to dynamically create a class at run time and then invoke the method?

The bytecode for a method refers to entries in the class's constant pool, so it doesn't make sense in isolation.

Related

ByteBuddy agent to replace one method param with another

Reduce visibility of classes and methods

Is there an analogue of visitLdcInsn for loading objects (not constant)?

java generics vs dynamically loading class using Class.forName()

Java reflection run-time performance

Categories

Resources