How does a ClassLoader determine which classes it can load?

I'm reading on class loading in Java.
Motivation
Assuming we have a classloader hierarchy that looks like this, I understand that classes loaded by First are not directly accessible by classes loaded by Second (and vice versa).
     Bootstrap
         |
       System
         |
       Common
       /    \
   First    Second
I also understand that a classloader checks with its parent class loader whether it can load the class and, if that is the case, delegates the loading to its parent.
Question
How do classloaders actually determine whether they can load some given class?

That depends on the implementation of the classloader. ClassLoader.loadClass(String) first delegates to the parent; if the parent cannot find the class, the loader's own ClassLoader.findClass(String) is called, and whatever that method can locate is what the loader can load.
There are many implementations, but the most common one is URLClassLoader, which loads classes from directories and jar files.
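To make the role of findClass(String) concrete, here is a minimal sketch of a loader that looks for class files under a single directory. DirectoryClassLoader and its root parameter are hypothetical names used only for illustration; they are not part of the JDK, and a real loader would need more care (resources, packages, concurrency).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical loader for illustration only; not a JDK class.
final class DirectoryClassLoader extends ClassLoader {
    private final Path root;

    DirectoryClassLoader(Path root, ClassLoader parent) {
        super(parent);
        this.root = root;
    }

    // loadClass(String) calls this only after the parent chain failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // com.foo.Bar -> <root>/com/foo/Bar.class
        Path file = root.resolve(name.replace('.', '/') + ".class");
        if (!Files.isRegularFile(file)) {
            throw new ClassNotFoundException(name);
        }
        try {
            byte[] bytes = Files.readAllBytes(file);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader loader =
                new DirectoryClassLoader(Paths.get("."), ClassLoader.getSystemClassLoader());
        // java.lang.String is found by a parent loader, so findClass is never asked for it.
        System.out.println(loader.loadClass("java.lang.String") == String.class);
    }
}
```

So "can this loader load class X?" comes down to: can a parent find it, and if not, does findClass locate bytes for that name.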

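The application classloader searches the entries (directories and jar files) listed on your classpath, which comes from the CLASSPATH environment variable or the -cp option. If the class file is found there, the class can be loaded; otherwise loading fails with a ClassNotFoundException.
So practically, the directory that holds your compiled classes and all its subdirectories (=packages) are searched; the /src directory itself is only searched if your .class files are written there.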

The classloader transforms the requested class name into a file name and then tries to find a "class file" of that name in the file system. As @poitroae notes, it uses the CLASSPATH variable, if set, as a starting place. Most IDEs and similar tools extend this to include your working directories for the project.

Related

Class conflict: two jar files with the same classes

I have two jar files with similar Util class names, but different method signatures.
In jar1, I have a main method which must use the method in Util class in jar1. The JVM is linking to Util class in jar2.
How to resolve this class conflict?

If both jar files are visible to the same classloader, which copy gets loaded depends on the order in which that loader searches its entries (on the classpath, the first matching entry wins), and that is fragile to rely on. The robust way to handle this is to isolate them so only one of them is visible to the classloader you are using.
You can set up a classloader and only load the jar you want to get the class from, but it is probably much easier to just make sure classes are unique on your path.

Normally one avoids that situation by using distinct package names.
In extreme situations, where you don't have the option of changing the jar files,
there is the -Xbootclasspath option, with which you can specify classes that get loaded first (note that since Java 9 only the append form, -Xbootclasspath/a, is still supported).
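Before isolating anything, it can help to see where the duplicates actually live. The following is a small diagnostic sketch, not something from the answers above: ClassLocations and locations are hypothetical names, and it only inspects what the system class loader can see.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every location visible to the system class loader
// that contains a class file for the given fully qualified class name.
final class ClassLocations {
    static List<URL> locations(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // "Util" is the class name from the question; pass another name as an argument.
        String name = args.length > 0 ? args[0] : "Util";
        for (URL url : locations(name)) {
            System.out.println(url);
        }
    }
}
```

If this prints more than one URL, the class exists in several places, and for a plain classpath the first entry listed is normally the one that gets loaded.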

Java resource from class vs Thread

What is the difference between
getClass().getResource("some-resource-file.txt")
vs
Thread.currentThread().getContextClassLoader().getResource("some-resource-file.txt")
I have resources in src/test/resources & I am trying to access them from Unit test. It's a typical maven style directory structure.
I was expecting both to behave identically, but they don't: getClass().getResource() doesn't find the resource, whereas going through the Thread's context class loader does.
So how do they differ?

Let's say you're developing a library and the library jar is placed into a web container's classpath.
Now let's say a webapp, using this library, is deployed in the container.
The webapp will have its own class loader, using WEB-INF/classes and WEB-INF/lib/*.jar as its classpath. And the container, for each request coming to your webapp, will set the current thread's context class loader to the class loader of the webapp.
When your library code uses getClass().getResource(), it will load the resource using the classloader used to load the library classes. It will thus use the container's class loader, and will thus use the resources in your library's jar and in the other libraries used to start the container.
If your library code uses Thread.currentThread().getContextClassLoader() instead to load the resource, it will use the classloader associated with the current thread, and will thus load the resources from the webapp's class loader, looking for the resource in WEB-INF/classes and in the jars inside WEB-INF/lib.
The latter can be what you want. For example, if you're designing a logging library (please don't), the logger will be able to read a different configuration file for each webapp, instead of having a single config shared by all the webapps.
Regarding the way the two methods look for resources, both finally delegate to a ClassLoader to load the resource. But loading it via a Class treats relative paths as relative to the package of that class, whereas loading it via a ClassLoader expects a path starting at the root of the package tree (without a leading slash). Suppose your class is in the package com.foo, then
MyClass.class.getResource("hello.txt")
is equivalent to
MyClass.class.getResource("/com/foo/hello.txt")
and is equivalent to
MyClass.class.getClassLoader().getResource("com/foo/hello.txt");
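The equivalence above can be written down as a small function. This is a sketch that approximates the name resolution documented for Class.getResource; ResourceNames and resolve are hypothetical names, and array classes are deliberately not handled.

```java
// Hypothetical helper that mimics how Class.getResource turns a name into
// the path it hands to the ClassLoader. Array classes are not handled.
final class ResourceNames {
    static String resolve(Class<?> base, String name) {
        if (name.startsWith("/")) {
            // Absolute name: drop the leading slash, ignore the package.
            return name.substring(1);
        }
        String className = base.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class in the default package
        }
        // Relative name: prefix the package of the class, with dots as slashes.
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(resolve(String.class, "hello.txt"));          // java/lang/hello.txt
        System.out.println(resolve(String.class, "/com/foo/hello.txt")); // com/foo/hello.txt
    }
}
```

This is also a likely explanation for the behaviour in the question: a bare "some-resource-file.txt" passed to getClass().getResource() is looked up inside the test class's package, while the ClassLoader variant looks at the classpath root.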

There is a special case getting the first class running (which is why you have to declare the main() method as static with an array of strings as an argument).
Once that class is loaded and is running, future attempts at loading classes are done by the class loader. At its simplest, a class loader creates a flat name space of class bodies that are referenced by a string name. Each class in Java uses its own classloader (the one that defined it) to load the classes it references. So if ClassA.class references ClassB.class then ClassB needs to be on the classpath of the ClassLoader of ClassA, or of its parents.
The thread context ClassLoader is a special one in that it is the ClassLoader associated with the currently running thread. This is useful in multi-classloader environments. An object can be created from a class in ClassLoader C and then passed to a thread owned by ClassLoader D. In this case the object needs to use Thread.currentThread().getContextClassLoader() directly if it wants to load resources that are not available on its own ClassLoader.
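One common way to combine the two lookups is to try the thread context loader first and fall back to the loader of your own class. The sketch below assumes that order is what you want; ResourceLookup and find are hypothetical names.

```java
import java.net.URL;

// Hypothetical helper: context class loader first, then the owner's own loader.
final class ResourceLookup {
    static URL find(String name, Class<?> owner) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        URL url = (context != null) ? context.getResource(name) : null;
        if (url == null) {
            ClassLoader own = owner.getClassLoader();
            // A null loader means the class was loaded by the bootstrap loader.
            url = (own != null) ? own.getResource(name) : ClassLoader.getSystemResource(name);
        }
        return url; // null if neither loader can see the resource
    }

    public static void main(String[] args) {
        // The resource name is the one used in the question.
        System.out.println(find("some-resource-file.txt", ResourceLookup.class));
    }
}
```

Note that name here is always a ClassLoader-style path (from the classpath root, no leading slash).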

Difference between Package and Directory in Java

In a Java Project, does keeping all the .java files in the same folder mean they are in the same package?
What is the difference in making a Package for our project compared to keeping all the project files in one folder?
This thread doesn't really address my question.

There is a relationship between package and directory, but it's one that you must maintain. If you have a class that's in "mypackage1.mypackage2", that means that the java command is going to expect to find it in a directory structure named "mypackage1\mypackage2" (assuming "backwards" Windows notation), with that directory structure further embedded in a directory (let's call it "myjava") whose name is in the classpath (or else is directly in the "current directory").
So your Java class (which internally says package mypackage1.mypackage2;) is in, say, "\Users\myName\myjava\mypackage1\mypackage2\", and you place "\Users\myName\myjava" in the class path, or else you have your current directory set to "\Users\myName\myjava".
If you mix this up, either the class will not be found at all, or you will get an error something like the ever-nebulous "NoClassDefFoundError".
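The mapping described above can be sketched as a function from a classpath root and a fully qualified class name to the file the java command expects. ClassFileLocation, expectedFile and the class name MyClass are hypothetical and only serve the illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: where a class file must live relative to a classpath root.
final class ClassFileLocation {
    static Path expectedFile(Path classpathRoot, String className) {
        String[] parts = className.split("\\.");
        Path path = classpathRoot;
        // Every package segment becomes one directory level.
        for (int i = 0; i < parts.length - 1; i++) {
            path = path.resolve(parts[i]);
        }
        // The simple class name becomes the file name.
        return path.resolve(parts[parts.length - 1] + ".class");
    }

    public static void main(String[] args) {
        // "myjava" is the classpath root from the answer above.
        System.out.println(expectedFile(Paths.get("myjava"), "mypackage1.mypackage2.MyClass"));
    }
}
```

If the file is anywhere else, or the package statement inside the class says something different, the lookup fails.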
As to why one would use packages (and directories), the reason has to do with "name space" and "separation of concerns" (look those up). Java would be much harder to keep straight if there were no packages and all the "java.lang", "java.io", "sun.misc", et al. classes were together. First off, one would have to use name "prefixes" to keep them straight and avoid name conflicts. And much of the logical grouping would be lost.
With your own projects you don't need to use packages for simple little programs you write for yourself, but if you write something you might give to someone else it's polite to use a package such as "myname.myproject" (substituting your name and project of course), so the person you give it to can combine it with others without name conflicts.
In large applications you'll find using further levels of separation helps you keep the functions straight, so you know where everything is. It also discourages you from "crossing the boundary" between different functional areas, so you don't get unrelated logic intertwined.
Eclipse (if you use that) kind of muddles the issue a bit because it "wants" to provide directory and package names and will sometimes (but not always) keep them in sync.

Packages provide a logical namespace for your classes.
These packages are stored in the form of directory levels (they are converted to nested directories) to provide physical grouping (a namespace) for your classes.
Also, note that the physical namespace has to be in accordance with the logical namespace. You can't have a class with package com.demo under the directory structure \com\demo\temp\; it has to be under \com\demo\, and the directory that contains \com has to be on the classpath so that your classes are visible to the JVM when it runs your code.
Suppose you have following directory structure: -
A
|
+-- Sample.java (contains Demo class under package B)
|
+-- Outer.java (contains Demo class - no package)
|
+-- B
|   |
|   +-- Demo.class
|
+-- C
|   |
|   +-- Abc.class
|
+-- Demo.class
Suppose the classes Abc.class (under directory C) and Demo.class (directly under directory A) aren't defined under any package, whereas the class Demo.class (under directory B) is defined under package B. So you need to have two directories in your classpath: \A (for two classes: Demo.class and B.Demo.class) and \A\C (for class Abc.class).
Classes in different packages can use the same name. That is why there won't be any conflict between the two Demo.class files defined above: they are in different packages. That is the whole point of dividing them into namespaces. This is beneficial, because you will not run out of unique names for your classes.

Understanding the class loader subsystem will answer your query.
With reference to Inside JVM by Bill Venners:
Given a fully qualified type name, the primordial class loader must in
some way attempt to locate a file with the type's simple name plus
".class". Hence, JVM searches a user-defined directory path stored in
an environment variable named CLASSPATH. The primordial loader looks
in each directory, in the order the directories appear in the
CLASSPATH, until it finds a file with the appropriate name: the type's
simple name plus ".class". Unless the type is part of the unnamed
package, the primordial loader expects the file to be in a
subdirectory of one the directories in the CLASSPATH. The path name of
the subdirectory is built from the package name of the type. For
example, if the primordial class loader is searching for class
java.lang.Object, it will look for Object.class in the java\lang
subdirectory of each CLASSPATH directory.

Java Classloader unable to load modules?

I seem to be having a problem with loading classes in a module loader for an application I'm developing. Basically, all classes I'm going to be loading with it extend another class, which is located in a package in the actual application. For our purposes, we'll call it Module. Modules are located in a separate folder outside the actual application.
The loader iterates through a folder and executes the loadFile() method on any file with the extension .class. All classes have the package declaration as the Module class, as well as the extends Module declaration in the class header.
This is the loadFile() method, header and exception clauses excluded:
String fileName = file.getName();
String className = fileName.replace(".class", ""); //Strips extension
Class<?> aClass = Class.forName(className, true, new URLClassLoader(new URL[] { file.toURI().toURL() }));
Class<? extends Module> modClass = aClass.asSubclass(Module.class);
return modClass.getConstructor().newInstance();
I keep getting a ClassNotFoundException on the third line. And beyond that, if the ClassNotFoundException weren't thrown, would all dependencies be resolved?

From the documentation for URLClassLoader:
This class loader is used to load classes and resources from a search path of URLs referring to
both JAR files and directories. Any URL that ends with a '/' is assumed to refer to a directory.
Otherwise, the URL is assumed to refer to a JAR file which will be opened as needed.
So, you must use URLs for either directories or .jars
Two solutions:
Force your users to give you .jar files, including a manifest of some sort inside with the classname that they wish to be loaded. This approach is used by the Bukkit developers. Having used this method in the past, the dependencies should all be packaged in the .jar file and thus in the URLClassLoader's search path and able to be loaded.
Use the URL of the file's directory, and search that directory for .class files. I'm not sure if dependencies will be loaded using this method.

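In the URLClassLoader, do not pass the file, but the parent folder. However, this works correctly only if the classes are all in the "default package", so the .class files you are loading must not have a package declaration on top. (For classes in a package, you would have to pass the root directory of the package tree and use the fully qualified class name.)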
By default, a class loader will also trigger loading of all classes required to properly build the class: it will try to load the super class, the super super class etc... all the interfaces and super interfaces, the classes needed for static fields and methods, the classes needed for method signatures (return types and parameters). It will not usually try to load classes used internally by methods, not until you execute those methods.
However, usually a class loader does not "contain" all those classes, for example your class will end up inheriting from java.lang.Object, and your URLClassLoader will not contain the Object.class file. So, class loaders delegate to their parent class loaders.
You are currently creating a URLClassLoader without specifying a parent. In Java 7 at least, the parent will default to the "system class loader", which is fine as long as you are in a plain Java application and not executing your code inside a specific hierarchy of class loaders. If however you are running that code in a web application, or in an OSGi container etc., then you should give the URLClassLoader a proper parent to delegate to, for example Thread.currentThread().getContextClassLoader() or this.getClass().getClassLoader().
I suppose you need all of this because you need to load those classes dynamically at runtime.
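Putting the two points together (directory URL instead of file URL, explicit parent), a corrected version of the loading step might look like the sketch below. ModuleClassLoading and loadFromDirectory are hypothetical names, and classRoot is assumed to be the root of the package tree, not the folder of an individual .class file inside a package.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper for illustration; not the asker's actual loader.
final class ModuleClassLoading {
    static Class<?> loadFromDirectory(File classRoot, String className, ClassLoader parent)
            throws Exception {
        // URLClassLoader treats a URL as a directory only if it ends with '/'.
        // File.toURI() adds the slash only for existing directories, so make sure of it.
        String uri = classRoot.toURI().toString();
        URL url = URI.create(uri.endsWith("/") ? uri : uri + "/").toURL();

        // The loader is intentionally left open: classes it defines may trigger
        // further lazy loading from the same directory later on.
        URLClassLoader loader = new URLClassLoader(new URL[] { url }, parent);

        // className must be fully qualified, e.g. "com.example.MyModule" (hypothetical).
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws Exception {
        Class<?> c = loadFromDirectory(new File("."), "java.lang.String",
                ModuleClassLoading.class.getClassLoader());
        System.out.println(c.getName()); // resolved through the parent chain
    }
}
```

Passing the class loader that loaded Module as the parent is what lets asSubclass(Module.class) succeed, since both sides then refer to the same Module class.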

Class.getResourceAsStream() issue

I have a JAR-archive with java classes. One of them uses some resource that is embedded into the same JAR. In order to load that resource I use
MyClass.class.getResourceAsStream(myResourceName);
One thing that bothers me though is whether it is guaranteed that required resource will be loaded from within the same JAR. The documentation for "getResourceAsStream()" method (and corresponding ClassLoader's method) is not really clear to me.
What would happen if there's a resource with the same name located somewhere in JVM classpath before my JAR? Will that resource be loaded instead of the one embedded in my JAR? Is there any other way to substitute resource embedded in JAR?

Yes. The first matching resource found on the class path is returned, just like an executable search path. This is why resources are often "namespaced" by putting them in directories that mirror the package structure of the library or application.
This behavior may be slightly different in the presence of custom classloaders (say in OSGi), but for vanilla Java apps, it is the case.

It works much the same way as for finding class files. So first try the parent class loader (recursively) then do whatever the class loader implementation does to find files.
There is no checking of the immediate caller class loader (as ResourceBundle does - see section 6.3 of the Java Secure Coding Guidelines). However, you do need permissions to open the URL, as ClassLoader.getResourceAsStream just calls URL.openStream in the default implementation.

Specify the package. Assuming you use com.yourcompany.file it SHOULD be unique. (Unless someone WANTS to override your config file via the classpath.)

If you want to read the file only from a specific JAR you can open the JarFile and read it directly.
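A minimal sketch of that approach, assuming you already know the path of the JAR on disk; JarResources and read are hypothetical names, and InputStream.readAllBytes() requires Java 9 or later.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: reads one entry from one specific JAR, bypassing the classpath.
final class JarResources {
    static byte[] read(File jar, String entryName) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            // Entry names use '/' separators and have no leading slash.
            JarEntry entry = jarFile.getJarEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(entryName + " not found in " + jar);
            }
            try (InputStream in = jarFile.getInputStream(entry)) {
                return in.readAllBytes();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarResources <path-to-jar> <entry-name>
        byte[] data = read(new File(args[0]), args[1]);
        System.out.println(data.length + " bytes");
    }
}
```

Because the class loader is not involved at all, a resource with the same name earlier on the classpath cannot shadow this one.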
