When trying to inject a class in the java.lang namespace via java.lang.instrument.Instrumentation#appendToBootstrapClassLoaderSearch on OpenJDK 11, nothing happens and no error is thrown. When I place the class to inject in a different package, it works as expected.
JarFile jar = new JarFile(new File("file/to/bootstrap.jar"));
instrumentation.appendToBootstrapClassLoaderSearch(jar);
// throws ClassNotFoundException java/lang/Dispatcher
Class.forName("java.lang.Dispatcher", false, null);
bootstrap.jar
└─ java/lang/Dispatcher.class
The reason I want to do this is to overcome issues with some OSGi containers. They typically restrict delegation to the bootstrap class loader to only certain packages. By default that obviously always includes java.* which is why I want to put my Dispatcher class there. I'm aware of org.osgi.framework.bootdelegation but that property only gets read during initialization. That means when attaching an agent at runtime, it's already too late to override this value.
An alternative would be to instrument all known OSGi class loaders and to whitelist the agent classes. But doing that for each framework, and testing it for each version, seems less feasible.
How can I inject a custom class like java.lang.Dispatcher into the bootstrap class loader? Are there other patterns or best practices to avoid OSGi bootdelegation issues?
To provide some more context:
My idea is to only inject this one Dispatcher class into the bootstrap class loader. The dispatcher basically just holds a static Map. The rest of the agent's classes would be loaded by a dedicated URLClassLoader which is a child of the bootstrap class loader. The agent would then register MethodHandles in the dispatcher's Map so that the injected byte code can get ahold of the MethodHandles which enable accessing the agent's classes loaded in the agent class loader.
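To make the idea more concrete, here is a minimal sketch of such a dispatcher. The names DispatcherSketch, AgentSideSketch, register, call and describe are made up for illustration, and both classes live in the same class loader here so the example runs on its own. In the real setup only the dispatcher would sit in the bootstrap class loader, which works because it refers to nothing but JDK types.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the single class injected into the bootstrap class loader.
// It only references JDK types, so it has no dependency on the agent's own classes.
final class DispatcherSketch {
    private static final Map<String, MethodHandle> HANDLES = new ConcurrentHashMap<>();

    private DispatcherSketch() {
    }

    // Called by the agent (from its own class loader) to publish an entry point.
    static void register(String key, MethodHandle handle) {
        HANDLES.put(key, handle);
    }

    // Called by injected byte code, which only needs to see this one class.
    static Object call(String key, Object... args) {
        MethodHandle handle = HANDLES.get(key);
        if (handle == null) {
            throw new IllegalStateException("No handle registered for " + key);
        }
        try {
            return handle.invokeWithArguments(args);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}

// Hypothetical agent-side class; in the real setup it would be loaded by the agent class loader.
final class AgentSideSketch {
    static String describe(String methodName) {
        return "entered " + methodName;
    }

    static void install() {
        try {
            MethodHandle handle = MethodHandles.lookup().findStatic(
                    AgentSideSketch.class, "describe",
                    MethodType.methodType(String.class, String.class));
            DispatcherSketch.register("describe", handle);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After AgentSideSketch.install(), instrumented code could call DispatcherSketch.call("describe", "someMethod") without ever linking against the agent's classes.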
It is possible by using the Unsafe API. Since Java 9, the boot class loader's implementation has changed: for a known package it only checks the designated jmod, and the boot search path is no longer checked.
Java 11 also removed the sun.misc.Unsafe#defineClass method but the same method is still available in jdk.internal.misc.Unsafe.
You do have to open the internal package of that class first. You can do so either by using sun.misc.Unsafe, which allows you to write a field value (accessible) without accessibility checks, or by using Instrumentation's official API.
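As a rough sketch of the Instrumentation route: Instrumentation#redefineModule (available since Java 9) accepts a map of packages to open. The class name InternalPackageOpener and its method names are invented for this example, and I am assuming the agent already holds the Instrumentation instance it received in premain or agentmain.

```java
import java.lang.instrument.Instrumentation;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; only Instrumentation#redefineModule is real JDK API here.
final class InternalPackageOpener {
    private InternalPackageOpener() {
    }

    // Builds the "extraOpens" argument: open jdk.internal.misc to the agent's module.
    static Map<String, Set<Module>> opensFor(Module agentModule) {
        return Map.of("jdk.internal.misc", Set.of(agentModule));
    }

    // Asks the JVM to open java.base/jdk.internal.misc to the given module.
    static void open(Instrumentation instrumentation, Module agentModule) {
        Module javaBase = Object.class.getModule();
        instrumentation.redefineModule(
                javaBase,
                Set.of(),               // extraReads
                Map.of(),               // extraExports
                opensFor(agentModule),  // extraOpens
                Set.of(),               // extraUses
                Map.of());              // extraProvides
    }
}
```

An agent would typically call InternalPackageOpener.open(instrumentation, SomeAgentClass.class.getModule()) before reflecting on jdk.internal.misc.Unsafe; SomeAgentClass is a placeholder for any class of the agent.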
If you are using Byte Buddy, have a look at the ClassInjector implementations which offer implementations for all approaches.
There is an open ticket for addressing the need of Java agents to inject helper classes, but until it is resolved, this is a common workaround.
Related
I'm currently working on a project, trying to insert additional Log4j logging statements into a running webapp. To realise that, I start a Java agent via a JVM parameter when launching WildFly:
-javaagent:path/to/agent.jar
The agent's premain method receives the Instrumentation object and establishes an MBean for remote access. The logging insertion is achieved using Instrumentation and Javassist. So far, this works perfectly.
However, to keep that working, the agent.jar also has to reside in the webapp's WAR file on deployment, since the log4j Logger class used for logging ships with this JAR. If not, I get a VerifyError when the class definition is updated by Instrumentation API. But trying to load e.g. classes from java.lang by inserting code like "Math.random()" works as expected.
It's important to notice that the agent classes are loaded with an AppClassLoader which is also the parent of the application's ModuleClassLoader.
Therefore I'm wondering why classes residing in agent.jar can't be loaded by delegation through the ModuleClassLoader.
These observations brought me to the assumption that the webapp module needs to declare an explicit dependency on external JARs, even if the classes are known to the parent AppClassLoader. For security reasons this would make sense to me.
Can anyone confirm these assumptions or does anyone have another idea or experience what causes this behavior?
Thanks!
---------------- EDIT ---------------------
Playing around with the WildFly classloading mechanism helped me to describe my problem more in detail. Assuming I want to load a class named com.example.LoggerClass (residing in agent.jar!) using the ModuleClassLoader in a ManagedBean belonging to my webapp:
Class<?> aClass = this.getClass().getClassLoader().loadClass("com.example.LoggerClass");
This results in a ClassNotFoundException!
But delegating this to the underlying AppClassLoader manually works perfectly:
Class<?> aClass = this.getClass().getClassLoader().getParent().loadClass("com.example.LoggerClass");
Now the JBoss docs concerning ModuleClassLoader's loadClass method tell the following:
Find a class, possibly delegating to other loader(s)
This may explain the behavior I showed above, assuming that ModuleClassLoader does not delegate class loading because of security issues. Is there any way to override this and make ModuleClassLoader delegate to AppClassLoader in certain cases?
A Java agent is always loaded by the system class loader. All of its dependencies must be available on the class path. If you are, however, experiencing a verifier error, your byte code is illegal, as verification happens before loading completes. This means your problem is not class loader related.
What is the difference between a thread's context class loader and a normal class loader?
That is, if Thread.currentThread().getContextClassLoader() and getClass().getClassLoader() return different class loader objects, which one will be used?
This does not answer the original question, but as the question is highly ranked and linked for any ContextClassLoader query, I think it is important to answer the related question of when the context class loader should be used. Short answer: never use the context class loader! But set it to getClass().getClassLoader() when you have to call a method that is missing a ClassLoader parameter.
When code from one class asks to load another class, the correct class loader to use is the same class loader as the caller class (i.e., getClass().getClassLoader()). This is the way things work 99.9% of the time because this is what the JVM does itself the first time you construct an instance of a new class, invoke a static method, or access a static field.
When you want to create a class using reflection (such as when deserializing or loading a configurable named class), the library that does the reflection should always ask the application which class loader to use, by receiving the ClassLoader as a parameter from the application. The application (which knows all the classes that need constructing) should pass it getClass().getClassLoader().
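A minimal sketch of what such a library method could look like. PluginFactory and newInstance are names I made up; the only real API used is Class.forName(String, boolean, ClassLoader) and the usual reflection calls.

```java
// Hypothetical library-side helper: the caller decides which class loader is used.
final class PluginFactory {
    private PluginFactory() {
    }

    static Object newInstance(String className, ClassLoader loader) {
        try {
            // Resolve the name against the loader supplied by the application,
            // not against the library's own loader or the thread context loader.
            Class<?> type = Class.forName(className, true, loader);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

An application class would call it as PluginFactory.newInstance("com.example.MyPlugin", getClass().getClassLoader()), where com.example.MyPlugin is a placeholder for a class the application can see.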
Any other way to obtain a class loader is incorrect. If a library uses hacks such as Thread.getContextClassLoader(), sun.misc.VM.latestUserDefinedLoader(), or sun.reflect.Reflection.getCallerClass() it is a bug caused by a deficiency in the API. Basically, Thread.getContextClassLoader() exists only because whoever designed the ObjectInputStream API forgot to accept the ClassLoader as a parameter, and this mistake has haunted the Java community to this day.
That said, many JDK classes use one of a few hacks to guess some class loader to use. Some use the ContextClassLoader (which fails when you run different apps on a shared thread pool, or when you leave the ContextClassLoader null), some walk the stack (which fails when the direct caller of the class is itself a library), some use the system class loader (which is fine, as long as it is documented to only use classes in the CLASSPATH) or the bootstrap class loader, and some use an unpredictable combination of the above techniques (which only makes things more confusing). This has resulted in much weeping and gnashing of teeth.
When using such an API, first, try to find an overload of the method that accepts the class loader as a parameter. If there is no sensible method, then try setting the ContextClassLoader before the API call (and resetting it afterwards):
ClassLoader originalClassLoader = Thread.currentThread().getContextClassLoader();
try {
Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
// call some API that uses reflection without taking ClassLoader param
} finally {
Thread.currentThread().setContextClassLoader(originalClassLoader);
}
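If this pattern shows up in several places, it can be wrapped once. The following is only a sketch; ContextClassLoaders and callWith are hypothetical names, not an existing JDK or library API.

```java
import java.util.function.Supplier;

// Hypothetical helper that applies the set/restore pattern around an arbitrary call.
final class ContextClassLoaders {
    private ContextClassLoaders() {
    }

    static <T> T callWith(ClassLoader loader, Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            // Always restore, even if the action throws.
            thread.setContextClassLoader(original);
        }
    }
}
```

Usage would look like ContextClassLoaders.callWith(getClass().getClassLoader(), () -> someApi.load()), where someApi.load() stands for whatever reflective API is missing a ClassLoader parameter.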
Each class will use its own classloader to load other classes. So if ClassA.class references ClassB.class then ClassB needs to be on the classpath of the classloader of ClassA, or its parents.
The thread context classloader is the current classloader for the current thread. An object can be created from a class in ClassLoaderC and then passed to a thread owned by ClassLoaderD. In this case the object needs to use Thread.currentThread().getContextClassLoader() directly if it wants to load resources that are not available on its own classloader.
There is an article on infoworld.com that explains the difference
=> Which ClassLoader should you use
(1)
Thread context classloaders provide a back door around the classloading delegation scheme.
Take JNDI for instance: its guts are implemented by bootstrap classes in rt.jar (starting with J2SE 1.3), but these core JNDI classes may load JNDI providers implemented by independent vendors and potentially deployed in the application's -classpath. This scenario calls for a parent classloader (the primordial one in this case) to load a class visible to one of its child classloaders (the system one, for example). Normal J2SE delegation does not work, and the workaround is to make the core JNDI classes use thread context loaders, thus effectively "tunneling" through the classloader hierarchy in the direction opposite to the proper delegation.
(2) from the same source:
This confusion will probably stay with Java for some time. Take any J2SE API with dynamic resource loading of any kind and try to guess which loading strategy it uses. Here is a sampling:
JNDI uses context classloaders
Class.getResource() and Class.forName() use the current classloader
JAXP uses context classloaders (as of J2SE 1.4)
java.util.ResourceBundle uses the caller's current classloader
URL protocol handlers specified via java.protocol.handler.pkgs system property are looked up in the bootstrap and system classloaders only
Java Serialization API uses the caller's current classloader by default
Adding to @David Roussel's answer, classes may be loaded by multiple class loaders.
Let's understand how class loaders work.
From Javin Paul's blog at Javarevisited:
ClassLoader follows three principles.
Delegation principle
A class is loaded in Java when it's needed. Suppose you have an application-specific class called Abc.class. The first request to load this class will come to the Application ClassLoader, which will delegate to its parent, the Extension ClassLoader, which further delegates to the Primordial or Bootstrap class loader.
The Bootstrap ClassLoader is responsible for loading standard JDK class files from rt.jar and it is the parent of all class loaders in Java. The Bootstrap class loader doesn't have any parent.
The Extension ClassLoader delegates class loading requests to its parent, Bootstrap, and if unsuccessful, loads the class from the jre/lib/ext directory or any other directory pointed to by the java.ext.dirs system property.
The System or Application class loader is responsible for loading application-specific classes from the CLASSPATH environment variable, the -classpath or -cp command line option, or the Class-Path attribute of the manifest file inside a JAR.
The Application class loader is a child of the Extension ClassLoader and it's implemented by the sun.misc.Launcher$AppClassLoader class.
NOTE: Except for the Bootstrap class loader, which is implemented natively, mostly in C, all Java class loaders are implemented using java.lang.ClassLoader.
Visibility Principle
According to the visibility principle, a child ClassLoader can see classes loaded by its parent ClassLoader, but the reverse is not true.
Uniqueness Principle
According to this principle, a class loaded by the parent should not be loaded again by the child ClassLoader.
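The delegation chain described above can be inspected with a few lines of code. This is just a sketch, and LoaderChain is a made-up name. Note that the description above matches Java 8 and earlier; since Java 9 rt.jar is gone and the extension class loader has been replaced by the platform class loader, so the printed names differ between versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that walks from a class loader up to (but not including) the bootstrap loader.
final class LoaderChain {
    private LoaderChain() {
    }

    static List<ClassLoader> of(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        // The bootstrap class loader is represented by null, so the walk stops there.
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            chain.add(loader);
        }
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader loader : of(ClassLoader.getSystemClassLoader())) {
            System.out.println(loader);
        }
        // Classes loaded by the bootstrap class loader report null as their loader.
        System.out.println(String.class.getClassLoader());
    }
}
```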
Recently I came across the Java custom class loader API. I found one use over here, kamranzafar's blog
I am a bit new to the class loader concept. Can anyone explain in detail the different scenarios where we may need it or should use it?
Custom class loaders are useful in larger architectures consisting of several modules/applications. Here are the advantages of a custom class loader:
Provides modular architecture: Allows defining multiple class loaders, which enables a modular architecture.
Avoids conflicts: Clearly limits the scope of a class to its class loader.
Supports versioning: Supports different versions of a class within the same VM for different modules.
Better memory management: Unused modules can be removed, which unloads the classes used by that module and frees memory.
Loads classes from anywhere: Classes can be loaded from anywhere, e.g. a database or the network, or even be defined on the fly.
Adds resources or classes dynamically: All the above features allow you to add classes or resources dynamically.
Runtime reloading of modified classes: Allows you to reload a class or classes at runtime by creating a child class loader of the actual class loader, which contains the modified classes.
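The hook that makes most of these features possible is ClassLoader#findClass, which is only consulted after the parent has failed to find the class. Below is a minimal sketch; RecordingLoader and tryLoad are invented names, and the loader deliberately defines nothing so the example stays self-contained. A real implementation would fetch the bytes from its source and pass them to defineClass at the marked spot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader that records which names reach findClass.
final class RecordingLoader extends ClassLoader {
    final List<String> findClassCalls = new ArrayList<>();

    RecordingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls.add(name);
        // A real loader would obtain the class bytes here (database, network, generated)
        // and return defineClass(name, bytes, 0, bytes.length).
        throw new ClassNotFoundException(name);
    }

    // Convenience wrapper returning null instead of throwing.
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

Loading java.lang.String through this loader never reaches findClass because the parent chain answers first, while an unknown name such as com.example.DoesNotExist does.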
The primary use is in application servers, so that they can run two applications without their classes conflicting. That is, if application 1 has a class with the same name as a class in application 2, with a custom class loader application 1 will load its class and application 2 will load its class.
Also if a class is loaded by a custom class loader it is possible to unload that class from the JVM. Again useful in application servers.
Another use would be for instrumentation, which is one way of doing aspect-oriented programming, or when using some persistence APIs. With a custom classloader you can add behaviour to the loaded classes before they are passed over to the running application.
Java class loaders do pretty much what the name suggests: load classes into memory so that they can be used.
Classes are also linked with the ClassLoader that loaded them.
Custom class loaders therefore open up a variety of interesting possibilities:
Loading multiple versions of the same class with different classloaders (e.g. to resolve possible versioning conflicts)
Loading and unloading classes dynamically at runtime
Generating new classes (e.g. JVM languages like Clojure use various classloading tricks to generate new compiled classes to represent Clojure functions at runtime)
Loading classes from non-standard sources
Normal Java applications don't usually need to worry about classloaders. But if you are writing a framework or platform that needs to host other code then they become much more important / relevant.
I'm trying to understand the security model used when the JVM is asked to load classes.
From the JVM specification on Sandboxing, I'm given to believe that a standard JVM implementation should maintain at least one other ClassLoader, independent of the primordial ClassLoader. This is used to load the application class files (from a provided classpath for example).
If a class is requested from the ClassLoader that is not in its namespace, java/lang/String for example, then it forwards the request to the primordial ClassLoader, which attempts to load the class from the Java API. If it's not there, then it throws a NoClassDefFoundError.
Am I right in thinking that the primordial ClassLoader only loads classes from the Java API namespace, and all other classes are loaded via a separate ClassLoader implementation?
And that this makes the loading of classes more secure because it means that a malicious class cannot masquerade as a member of the Java API (let's say java/lang/Virus) because this is a protected namespace, and cannot be used in the current ClassLoader?
But is there anything to prevent the Classes of the Java API being replaced by malicious classes, or would that be detected during class verification?
For historical reasons the names used for class loaders are a little peculiar. The boot class loader loads the system classes. The system class loader, by default, loads classes from the class path, not the system classes. The system classes are in jre/lib (mostly in rt.jar), endorsed directories and anywhere added through -Xbootclasspath.
On the Sun/Oracle JRE, rt.jar contains classes with packages starting with java., javax., sun., com.sun., org.omg, org.w3c and org.xml.
Untrusted code (including configuration) should not be able to add to the system classes. Some package name prefixes may be restricted through a security property. The java. prefix is specially protected, for non-technical reasons.
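The special protection of the java. prefix can be observed directly: ClassLoader#defineClass rejects such names before it even looks at the bytes. The sketch below uses made-up names (ProhibitedPackageDemo, NaiveLoader, tryDefine) and passes an empty byte array on purpose, so a name outside java. fails later with a class format error instead.

```java
// Hypothetical demo of the name check performed by ClassLoader#defineClass.
final class ProhibitedPackageDemo {
    private ProhibitedPackageDemo() {
    }

    private static final class NaiveLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    static String tryDefine(String className) {
        try {
            new NaiveLoader().define(className, new byte[0]);
            return "defined";
        } catch (SecurityException e) {
            // Raised for names starting with "java." before the bytes are parsed.
            return "SecurityException";
        } catch (LinkageError e) {
            // The empty byte array is not a valid class file.
            return "LinkageError";
        }
    }
}
```

So an ordinary class loader cannot define java.lang.Virus, regardless of what the class file contains.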
Generally a class loader will delegate to its parent before defining a new class, preventing any classes from an ancestor loader from being replaced. Java EE recommends (even though Java SE bans) having some class loaders prefer their own classes, typically to use a more up to date API or a different implementation. This allows shadowing of classes, but only as seen through that loader (and its children). All other classes continue to link to the original.
I understand how to use Dynamic Proxies in Java but what I don't understand is how the VM actually creates a dynamic proxy. Does it generate bytecode and load it? Or something else? Thanks.
At least for Sun's implementation, if you look at the source code of java.lang.reflect.Proxy you'll see that yes, it generates the byte code on-the-fly (using the class sun.misc.ProxyGenerator).
I suggest that you read Dynamic Proxy Classes:
The Proxy.getProxyClass method returns the java.lang.Class object for a proxy class given a class loader and an array of interfaces. The proxy class will be defined in the specified class loader and will implement all of the supplied interfaces. If a proxy class for the same permutation of interfaces has already been defined in the class loader, then the existing proxy class will be returned; otherwise, a proxy class for those interfaces will be generated dynamically and defined in the class loader. [emphasis mine]
The proxy class is generated on the fly (hence dynamic proxy) and loaded by the classloader. That's why, if you debug applications that rely on JDK proxying, you'll see a bunch of classes named 'com.sun.proxy.$Proxy0'.
To test my theory, you can run an example from Dynamic Proxy Classes with the VM parameter -verbose:class, which lists the classes as they are loaded; among them you should notice com.sun.proxy.$Proxy0.
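A small self-contained way to see the generated class without a debugger. This is only a sketch; ProxyClassDemo and newProxy are names chosen for the example, and the exact package of the generated class depends on the JDK version, so the code prints the name rather than assuming it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical demo that creates a proxy for Runnable and inspects its class.
final class ProxyClassDemo {
    private ProxyClassDemo() {
    }

    static Runnable newProxy(ClassLoader loader) {
        // A do-nothing handler is enough, since only the generated class is of interest.
        InvocationHandler handler = (proxy, method, args) -> null;
        return (Runnable) Proxy.newProxyInstance(loader, new Class<?>[] {Runnable.class}, handler);
    }

    public static void main(String[] args) {
        Runnable proxy = newProxy(ProxyClassDemo.class.getClassLoader());
        // Prints the name of the class that was generated at runtime.
        System.out.println(proxy.getClass().getName());
        System.out.println(Proxy.isProxyClass(proxy.getClass()));
    }
}
```

Running it with -verbose:class should additionally show the generated class in the class loading log.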