Trying to understand Java Classloading - java

I'm currently getting to know Java and OSGi, so I've read a few books. In one particular book the class loading is described.
You can download it (free and legal) from the author's page (Neil Bartlett):
OSGi Book
Pages 9 and 10 show these pictures:
[image: http://img265.imageshack.us/img265/4127/picture1qct.png]
[image: http://img297.imageshack.us/img297/594/picture2yv.png]
It seems like there is the possibility that our class "Foo" won't use the class "Bar" of foobar.jar, but instead class "Bar" from naughty.jar.
Because of the flat and global structure of the Java classpath this could be, but as far as I know you would define a package from where you want to import a certain class:
import foobar.Bar
This should prevent loading the wrong class, shouldn't it? Of course assuming that the package is called "foobar".

The import statement has nothing to do with classloading - it just allows you to use the short name Bar instead of the fully qualified foobar.Bar. If both foobar.jar and naughty.jar contain a class with the fully qualified name foobar.Bar, the classloader will load the class from the first jar file on the classpath that contains the required class.
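To make the point about import concrete, here is a minimal sketch (the class name ImportDemo is made up for illustration). It uses java.util.ArrayList once through the short name and once through the fully qualified name, and checks that both refer to the very same class:

```java
import java.util.ArrayList; // only lets us write "ArrayList" instead of "java.util.ArrayList"

public class ImportDemo {
    // Returns true if the short name and the fully qualified name denote the same class.
    static boolean sameClass() {
        ArrayList<String> viaImport = new ArrayList<>();
        java.util.ArrayList<String> viaFullName = new java.util.ArrayList<>();
        return viaImport.getClass() == viaFullName.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameClass());               // prints true
        System.out.println(ArrayList.class.getName()); // prints java.util.ArrayList
    }
}
```

Nothing in the import says which jar the class comes from; that is decided separately by the classpath.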

Good idea, but unfortunately packages are independent of the jar file names. You can have things in arbitrary packages all in the same jar file, and arbitrary jar file names with unrelated package names. It's up to the classloader to resolve them for you.

The problem is that both foobar.jar and naughty.jar might have a class whose fully qualified name is foobar.Bar. Then foobar.Foo resolves to the foobar.Bar of naughty.jar instead of the foobar.Bar of foobar.jar.
Hope this helps.

The author is assuming here that both versions of the Bar classes are in the same package (or there wouldn't be any problem). What the author is describing can happen if naughty.jar is "first" in the class path. In that case, the classloader would pick the naughty version (the classloader scans the classpath in the natural order and picks the first class found).
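If you want to see what "first in the class path" means for your own program, a small sketch like the following prints the entries in the order the JVM was given them. The class name ClasspathOrder is hypothetical, and this assumes the standard application class loader, which searches the entries in that order; custom loaders (OSGi, application servers) may behave differently.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathOrder {
    // Splits the JVM's class path into its entries, preserving the order they were given in.
    static List<String> entries() {
        String classPath = System.getProperty("java.class.path", "");
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        List<String> entries = entries();
        for (int i = 0; i < entries.size(); i++) {
            System.out.println(i + ": " + entries.get(i));
        }
    }
}
```

If naughty.jar shows up with a lower index than foobar.jar, its foobar.Bar is the one that would be picked.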

The import doesn't give you the liberty of loading the class from the desired jar. You can read more about classloaders here: Java Classloaders

Related

In JAVA, how do I determine where an import is coming from?

I am not using any IDE and am trying to fix some errors for a deployed WAR file in Tomcat. I am trying to find the source of a package; the erroneous line seems to require parameters and has been imported from some package, like
import com.somefirm.somepackage.someClass;
The following questions did not answer my question:
In eclipse determine which jar file a class is from
How can I find files imported in a java class
I want to know whether there is any way I can find the source of an import manually. Is it even possible? How does a class look for packages to import?
Edit 1: Separated the links with a newline.
Edit 2: "am not using any IDE at the moment" was a bit late in the question. SO added Am not using IDE to first line.
Edit 3: Provided more clarity to the question, as to why I am needing it.
Edit 4: Added these edits. Thanks to #Jude-niroshan and #ErwinBolwidt
The point is: the exact location of a class is determined by your class path setup.
At some point you define which classes are available when compiling your application, and which classes ship with it.
So, when you are not using an IDE, you have to search the "elements" of the class path that gets applied for your build, for example by looking into each jar file.
Given your comments: I think you have to step back. You seem to lack basic knowledge of Java. You have to understand how that WAR file is built. There should be some sort of build description, containing the dependencies and other contents of the WAR delivery. You have to analyse those. Beyond that: if these packages are from your team/company/... a simple file search might do the job. If those packages are "external", like open source libraries, then you might try to simply google for the class name, or turn to grepcode.com.
And the other thing you asked: a compiled class contains only fully qualified class names. There are no import statements in class files any more. So when a class "needs" another class, it asks the JVM to load that class (given the fully qualified name). And the JVM simply looks into the classpath, and loads the first class that matches the given name.
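If you can run a bit of code inside the application, you can also ask the JVM where it actually loaded a class from. The following is only a sketch: the class name WhereFrom and the helper locationOf are made up, and it relies on the standard Class.getProtectionDomain() and CodeSource.getLocation() calls. Classes that ship with the JDK itself often report no code source, so that case is handled separately.

```java
import java.security.CodeSource;

public class WhereFrom {
    // Returns the jar or directory a class was loaded from, or a note when the JVM reports none.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source reported, typically a class from the JDK itself)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Replace WhereFrom.class with the class you are hunting for.
        System.out.println(locationOf(WhereFrom.class));
        System.out.println(locationOf(String.class));
    }
}
```

In a deployed web application you would call something like locationOf(SomeClass.class) from a place that already runs there and write the result to the log, rather than from a main method.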

Why do java source files require package declarations?

I think I am failing to understand java package structure, it seemed redundant to me that java files have a package declaration within, and then are also required to be present in a directory that matches the package name. For example, if I have a MyClass.java file:
package com.example;
public class MyClass {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}
Then I would be required to have this file located in com/example, relative to the base directory, and I would execute java com.example.MyClass from the base directory to run it.
Why wouldn't the compiler be able to infer the package name by looking at the directory structure? For example, if I compiled the file from the base directory with javac com\example\MyClass.java, I don't understand why MyClass.java wouldn't implicitly belong to the com.example package.
I understand there is a default package, but it still seems that the package declaration in the source file is redundant information?
As you (implicitly) acknowledged, you are not required to declare the name of a package in the case of the default package. Let us put that quibble aside ...
The reason for this seeming redundancy is that without a package declaration, the meaning of Java [1] source code would be ambiguous. For example, a source file whose pathname was "/home/steve/project/src/com/example/Main.java" could have 7 different fully qualified names, depending on how you compiled the code. Most likely, only one of those will be the "correct" one. But you wouldn't be able to tell which FQN is correct by looking at (just) the one source file.
It should also be noted that the Java language specification does not require you to organize the source code tree according to the packages. That is a requirement of a (large) family of Java compilers, but a conformant compiler could be written that did not require this. For example:
The source code could be held in a database.
The source code could be held in a file tree with random file names [2].
In such eventualities, the package declaration would not be duplicative of file pathnames, or (necessarily) of anything. However, unless there was some redundancy, finding the correct source "file" for a class would be expensive for the compiler ... and problematic for the programmer.
Considerations like the above are the practical reason that most Java tool chains rely on file tree structure to locate source and compiled classes.
1 - By this, I mean hypothetical dialect of Java which didn't require package declarations.
2 - The compiler would need to scan the file tree to find all Java files, and parse them to work out which file defined which class. Possible, but not very practical.
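The "7 different fully qualified names" claim can be checked with a small sketch. The class CandidateNames and its helper are made up for illustration; the idea is simply that every directory on the path could be the source root, and each choice gives a different package prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CandidateNames {
    // Lists every fully qualified name a source file could have, one per possible source root.
    static List<String> candidates(String sourcePath) {
        String withoutExtension = sourcePath.replaceAll("\\.java$", "");
        String[] parts = withoutExtension.replaceAll("^/+", "").split("/");
        List<String> names = new ArrayList<>();
        // Start with the bare class name, then prepend one more directory each time.
        for (int start = parts.length - 1; start >= 0; start--) {
            names.add(String.join(".", Arrays.copyOfRange(parts, start, parts.length)));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = candidates("/home/steve/project/src/com/example/Main.java");
        names.forEach(System.out::println);
        System.out.println(names.size() + " candidates"); // prints "7 candidates"
    }
}
```

The list runs from Main (default package) through com.example.Main up to home.steve.project.src.com.example.Main, and only the package declaration tells you which one was intended.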
Turn the question on its head:
Assume that the package statement is the important thing - It represents the namespace of the class and belongs in the class file.
So now the question is - Why do classes have to be in folders that match their package?
The answer is that it makes finding them much easier - it is just a good way to organize them.
Does that help?
You have to keep in mind that packages do not just indicate the folder structure. The folder structure is the convention Java adopted to match the package names, just like the convention that the class name must match the filename.
A package is required to disambiguate a class from other classes with the same name. For instance java.util.Date is different from java.sql.Date.
The package also gives access to methods or members which are package-private, to other classes in the same package.
You have to see it the other way round. The class has all the information about itself, the class name and the package name. Then when the program needs it, and the class is not loaded yet, the JVM knows where to look for it by looking at the folder structure that matches the package name and the class with the filename matching its class name.
In fact there's no such obligation at all.
Oracle JDK's javac (and I believe most other implementations too) will happily compile your HelloWorld class, no matter what directory it is in and what package you declare in the source file.
Where the directory structure comes into the picture is when you compile multiple source files that refer to each other. At this point the compiler must be able to look them up somehow. But all it has in the source code is the fully qualified name of the referred class (which may not even have been compiled yet).
At runtime the story is similar: when a class needs to be loaded, its fully qualified name is the starting point. Now the class loader's job is to find a .class file (or an entry in a ZIP file, or any other imaginable source) based on the FQN alone, and again the simplest thing in a hierarchical file system is to translate the package name into a directory structure.
The only difference is that at runtime your "standalone" class too has to be loaded by the VM, therefore it needs to be looked up, therefore it should be in the correct folder structure (because that's how the bootstrap class loader works).
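The translation from a fully qualified name to a path is mechanical, which a short sketch can show. FqnToPath and toResourcePath are hypothetical names; ClassLoader.getSystemResource is the standard call for asking where a resource with that path can be found, and what it prints depends on your JDK and classpath.

```java
import java.net.URL;

public class FqnToPath {
    // Translates a fully qualified class name into the resource path a class loader looks up.
    static String toResourcePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        String path = toResourcePath("java.util.ArrayList");
        System.out.println(path); // prints java/util/ArrayList.class

        // Ask the system class loader where (if anywhere) it finds that resource.
        URL location = ClassLoader.getSystemResource(path);
        System.out.println(location);
    }
}
```

This is why a class declared in package com.example is expected under com/example/ relative to some classpath entry.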

Java Package Vs Folder-Structure? what is the difference

I would like to know what the difference is between a folder structure and a package in the Eclipse IDE for Java EE development.
When do we use which one, and why?
What should the practice be:
create a folder structure like src/com/utils and then create a class inside it
create a package like src.com.util and then create a class inside it
Which option would be better and easier to deploy if I have to write an Ant script later for deployment?
If I go for the folder structure, will deployment be as easy as copying files from development to the deployment target?
If you have configured things correctly, adding a folder inside src is the same as adding a package from File > New Package.
So it's up to you, whatever feels comfortable: add a folder or create a package. Also, when you put things under src, the package name starts from the subfolder. So src/com/naishe/test will be package com.naishe.test.
Basically there is no difference, both are the same.
In both the cases, the folder structure will be src/com/utils.
and in both the cases, you will need to mention
package com.utils;
as first line in the class
Since it doesn't have any difference practically, it won't make any difference to ant script.
"Packaging helps us to avoid class name collision when we use the same class name as that of others. For example, if we have a class name called "Vector", its name would crash with the Vector class from JDK. However, this never happens because JDK use java.util as a package name for the Vector class (java.util.Vector). So our Vector class can be named as "Vector" or we can put it into another package like com.mycompany.Vector without fighting with anyone. The benefits of using package reflect the ease of maintenance, organization, and increase collaboration among developers. Understanding the concept of package will also help us manage and use files stored in jar files in more efficient ways."
check out http://www.jarticles.com/package/package_eng.html for more information on packages
create a package like 'src.com.util'
That sounds like a mistake. The package name should be 'com.util', and 'src' is the name of the source folder.
Other than that, I fail to see what the difference is between your two choices. The result is the same, right? Just different steps in the GUI to arrive at it. The wizard to create a new package in Eclipse is just a wrapper around creating the appropriate folder hierarchy within a source folder.
You don't need to create empty packages at all, you can directly create classes (the package will be created automatically if it does not already exist).
A package is automatically "source folder" where folder is just a normal folder.
When you compile an Eclipse project, all files in source folders are compiled but not in regular folders (unless those regular folders a)
Folder structure, or to be specific the source folder in Eclipse, is meant just for Eclipse, but a package is universal, irrespective of any editor.

How can I include a file from the directory above in Java?

I'm not sure how to import a file from a directory above. That is, I have a setup like so
directory: MyProject
    Main.java
    directory: Other
        Other.java
Basically, Main.java is in "MyProject" and Other.java is in a folder inside the project's root folder. I can easily do
import Other.*;
to get those files available in Main, but how do I get Main.java to be visible to Other.java?
import ../Main.java
Obviously this doesn't work, but that's the general functionality I'm looking for. Any suggestions? I would prefer not having to use absolute paths. Thanks!
Edit: I meant import not include. Sorry. Been using C++ too much.
Java does not include files. You can however directly use classes using the simple name by using import statements.
Basically you need a file per (top level) class you define. This allows IDE's to rename compilation units, and do other refactorings. Besides that, it lets you easily add code at the right spot.
Java does use packages to create namespaces. Packages themselves are completely separate namespaces. Although the namespace seems to be a tree structure, in Java each package is actually not related to any other package. Hence you cannot use it as a folder structure, using .. is not allowed. This may change once "super packages" are introduced.
The Java import statement looks a lot like #include, but the name change is deliberate: instead of grabbing the file to make the definitions in that file known, it is simply a statement to make it easier to refer to classes and interfaces. It has no other effect than having a shorter name to a class (or, for import static, constants and other static members).
Most of the time the top level classes are represented using a folder structure that reflects the package name. This makes it easy for IDE's and developers to find the file representing the class. It also makes it easy to put in version control. It is however not part of the Java specification itself; the location of Java source and classes is not defined. Earlier IBM IDE's actually stored Java source and classes in a database for instance; they did not use files at all. Newer IDE's such as Eclipse may use different source folders, e.g. one for Unit test files and one for the library itself.
So finally, the only way to include packages is by specifying the full package name, then a dot and then the class to import, or the * wildcard to import all classes of that package.
import java.util.Vector;
import java.util.*;
Most IDE's will create these import statements for you, possibly after you have chosen the right class to import (in case there are classes with the same name in different packages).
More information can be found in the Java Language Specification (Java 7 version).
In your case you have defined a Main class in the root or default package which is strongly discouraged. You can directly refer to Main without any import statement. The Other class is in the identically named Other package (using uppercase in package names is strongly discouraged as well). You can refer to it by using import Other.Other.
include ???
Java doesn't have source file inclusion support; it uses naming conventions instead, so you should import the namespace (package) that you need in your source file.
You should define a package for your main class and then import it in the Other class.
Main.java is in the default package, and it is impossible to import a class in the default package from other (named) packages.
Put it in a package and import as normal:
directory: MyProject
    directory: base
        Main.java
    directory: other
        Other.java
(also package names are lowercase normally)
If you have a file outside of your project, it means this file:
wouldn't be compiled by the project
wouldn't get into the jar
can't be used at runtime
So you really shouldn't include it.
Either move it into the project, or include the dependent project which contains that file.
Java is not like C++. You include by package name. So if toplevel file is in project AAA in folder src/aaa then you should include that project as dependent jar and refer to file as import aaa.Main
I think import Main; should just work.
You should read up on the Java concepts of package and classpath. Please look at the documentation here. The options that will work for you are sourcepath and classpath.

Java project structure explained for newbies?

I come from a .NET background and am completely new to Java and am trying to get my head around the Java project structure.
My typical .NET solution structure contains projects that denote logically distinct components, usually named using the format:
MyCompany.SomeApplication.ProjectName
The project name usually equals the root namespace for the project. I might break the namespace down further if it's a large project, but more often than not I see no need to namespace any further.
Now in Java, you have applications consisting of projects, and then you have a new logical level - the package. What is a package? What should it contain? How do you namespace within this App.Project.Package structure? Where do JARs fit into all this? Basically, can someone provide a newbies intro to Java application structure?
Thanks!
Edit: Some really cracking answers thanks guys. A couple of followup questions then:
Do .JAR files contain compiled code? Or just compressed source code files?
Is there a good reason why package names are all lower case?
Can Packages have 'circular dependencies'? In other words, can Package.A use Package.B and vice versa?
Can anyone just show the typical syntax for declaring a class as being in a package and declaring that you wish to reference another package in a class (a using statement maybe?)
"Simple" J2SE projects
As cletus explained, source directory structure is directly equivalent to package structure, and that's essentially built into Java. Everything else is a bit less clear-cut.
A lot of simple projects are organized by hand, so people get to pick a structure they feel OK with. What's often done (and this is also reflected by the structure of projects in Eclipse, a very dominant Java tool) is to have your source tree begin in a directory called src. Your package-less source files would sit directly in src, and your package hierarchy, typically starting with a com directory, would likewise be contained in src. If you CD to the src directory before firing up the javac compiler, your compiled .class files will end up in the same directory structure, with each .class file sitting in the same directory and next to its .java file.
If you have a lot of source and class files, you'll want to separate them out from each other to reduce clutter. Manual and Eclipse organization often place a bin or classes directory parallel to src so the .class files end up in a hierarchy that mirrors that of src.
If your project has a set of .jar files to deliver capability from third-party libraries, then a third directory, typically lib, is placed parallel to src and bin. Everything in lib needs to be put on the classpath for compilation and execution.
Finally, there's a bunch of this and that which is more or less optional:
docs in doc
resources in resources
data in data
configuration in conf...
You get the idea. The compiler doesn't care about these directories, they're just ways for you to organize (or confuse) yourself.
J2EE projects
J2EE is roughly equivalent to ASP.NET, it's a massive (standard) framework for organizing Web applications. While you can develop your code for J2EE projects any way you like, there is a firm standard for the structure that a Web container will expect your application delivered in. And that structure tends to reflect back a bit to the source layout as well.
Here is a page that details project structures for Java projects in general (they don't agree very much with what I wrote above) and for J2EE projects in particular:
http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html
Maven projects
Maven is a very versatile project build tool. Personally, my build needs are nicely met by ant, which roughly compares with nmake. Maven, on the other hand, is complete-lifecycle build management with dependency management bolted on. The libs and source for most of the code in the Java world are freely available on the 'net, and maven, if asked nicely, will go crawling it for you and bring home everything your project needs without you needing to even tell it to. It manages a little repository for you, too.
The downside to this highly industrious critter is the fact that it's highly fascist about project structure. You do it the Maven way or not at all. By forcing its standard down your throat, Maven manages to make projects worldwide a bit more similar in structure, easier to manage and easier to build automatically with a minimum of input.
Should you ever opt for Maven, you can stop worrying about project structure, because there can only be one. This is it: http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html
A package in Java is very similar to a namespace in .Net. The name of the package essentially creates a path to the classes that live inside it. This path can be thought of as the class's namespace (in .Net terms) because it is the unique identifier for the specific class you want to use. For example if you have a package named:
org.myapp.myProject
And inside it you had a bunch of classes:
MyClass1
MyClass2
To specifically refer to those classes you would use:
org.myapp.myProject.MyClass1
org.myapp.myProject.MyClass2
The only real difference between this and .Net (that I know of) is that Java organizes its "namespaces" structurally (each package is a distinct folder) whereas .Net allows you to scope classes using the namespace keyword and ignores where the document actually lives.
A JAR file is roughly analogous to a DLL in most cases. It is a compressed file (you can open them with 7zip) that contains compiled classes (and sometimes source code) from other projects that can be added as dependencies in your application. Libraries are generally contained in JARs.
The thing to remember about Java is that it is very structural; WHERE files live is important. Of course there is more to the story than what I posted, but I think this should get you started.
A package is much like a .Net namespace. The general convention in Java is to use your reversed domain name as a package prefix so if your company is example.com your packages will probably be:
com.example.projectname.etc...
It can be broken down to many levels rather than just one (projectname) but usually one is sufficient.
Inside your project structure classes are usually divided into logical areas: controllers, models, views, etc. It depends on the type of project.
There are two dominant build systems in Java: Ant and Maven.
Ant is basically a domain-specific scripting language and quite flexible but you end up writing a lot of boilerplate stuff yourself (build, deploy, test, etc tasks). It's quick and convenient though.
Maven is more modern and more complete and is worth using (imho). Maven is different to Ant in that Maven declares that this project is a "Web application project" (called an archetype). Once that is declared the directory structure is mandated once you specify your groupId (com.example) and artifactId (project name).
You get a lot of stuff for free this way. The real bonus of Maven is that it manages your project dependencies for you so with a pom.xml (Maven project file) and correctly configured Maven you can give that to someone else (with your source code) and they can build, deploy, test and run your project with libraries being downloaded automatically.
Ant gets something like this with Ivy.
Here are some notes about Java packages that should get you started:
The best practice with Java package names is to use the domain name of the organisation as the start of the package, but in reverse, e.g. if your company owns the domain "bobswidgets.com", you would start your package off with "com.bobswidgets".
The next level down will often be the application or library level, so if it's your ecommerce libraries, it could be something like "com.bobswidgets.ecommerce".
Further down than that often represents the architecture of your application. Classes and interfaces that are core to the project reside in the "root" e.g. com.bobswidgets.ecommerce.InvalidRequestException.
Using packages to subdivide functionality further is common. Usually the pattern is to put interfaces and exceptions into whatever the root of the subdivision is and the implementation into sub packages e.g.
com.bobswidgets.ecommerce.payment.PaymentAuthoriser (interface)
com.bobswidgets.ecommerce.payment.PaymentException
com.bobswidgets.ecommerce.payment.paypal.PaypalPaymentAuthoriser (implementation)
This makes it pretty easy to pull the "payment" classes and packages into their own project.
Some other notes:
Java packages are tightly coupled to directory structure. So, within a project, a class with the fully qualified name com.example.MyClass will invariably be in com/example/MyClass.java. This is because when it is packaged up into a Jar, the class file will definitely be in com/example/MyClass.class.
Java packages are loosely coupled to projects. It is quite common that projects will have their own distinct package names e.g. com.bobswidgets.ecommerce for ecommerce, com.bobswidgets.intranet for the intranet project.
Jar files will contain the class files that are the result of compiling your .java code into bytecodes. They are just zip files with a .jar extension. The root of the Jar file is the root of the namespace hierarchy e.g. com.bobswidgets.ecommerce will be /com/bobswidgets/ecommerce/ in the Jar file. Jar files can also contain resources e.g. property files etc.
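The "just zip files" point can be tried out with a short sketch. Everything here is illustrative: the class name JarIsZip, the helper roundTrip, and the entry name are made up, and the entry holds a placeholder byte rather than a real class file. The sketch writes a tiny jar with java.util.jar.JarOutputStream and then reads it back with the plain java.util.zip.ZipFile API.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarIsZip {
    // Writes a one-entry jar to a temp file, then lists its entries using the zip API.
    static List<String> roundTrip(String entryName) {
        try {
            File jar = File.createTempFile("demo", ".jar");
            jar.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
                out.putNextEntry(new JarEntry(entryName));
                out.write(new byte[] {0}); // placeholder bytes, not a real class file
                out.closeEntry();
            }
            List<String> names = new ArrayList<>();
            try (ZipFile zip = new ZipFile(jar)) {
                Enumeration<? extends ZipEntry> entries = zip.entries();
                while (entries.hasMoreElements()) {
                    names.add(entries.nextElement().getName());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The entry path mirrors the package: com.bobswidgets.ecommerce -> com/bobswidgets/ecommerce/
        System.out.println(roundTrip("com/bobswidgets/ecommerce/Example.class"));
    }
}
```

The entry name inside the archive is exactly the package-as-path form described above.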
A package is a grouping of source files that lets them see each others' package-private methods and variables, so that that group of classes can access things in each other that other classes can't.
The expectation is that all java classes have a package that is used to disambiguate them. So if you open a jar file in your project, like spring, every package starts with org.springframework. The classloaders don't know about the jarfile name, they use only the package.
There's a common practice of breaking things down by type of object or function, not everybody agrees about this. Like Cletus posted here, there's a tendency to group web controllers, domain objects, services, and data access objects into their own packages. I think some Domain-Driven Design people do not think this is a good thing. It does have the advantage that typically everything in your package shares the same kind of dependencies (controllers might depend on services and domain objects, services depend on domain objects and data access objects, etc.) so that can be convenient.
Okay, so in Java you have four different types of access to a class's member functions and variables:
public
protected
package-private
private
All classes in the same package can see each other's public, protected, and package-private elements.
Packages are not hierarchical in the system. Usually they are organized in a hierarchical way, but as far as runtime is concerned com.example.widgets is a completely different package from com.example.widgets.cogs
Packages are arranged as directories, which helps keep things organized: your file structure is always similar to your package structure.
They are planning on adding a module system to Java in JDK7 (called Project Jigsaw) and there is an existing module system called OSGi. These module systems will/can give you a lot more flexibility and power than the simple package system.
Also, package names are usually all lower case. :)
To answer the example sub-question:
package com.smotricz.goodfornaught;
import java.util.HashMap;
import javax.swing.*;
public class MyFrame extends JFrame {
    private HashMap myMap = new HashMap();
    public MyFrame() {
        setTitle("My very own frame");
    }
}
Do .JAR files contain compiled code? Or just compressed source code files?
They might contain both, or even totally different kinds of files like pictures. It's a zip archive first of all. Most often you would see JARs that contain class files, and those which contain source files (handy for debugging in your IDE if you use third party code) or those that contain javadoc (source code documentation), also handy if your IDE supports tooltipping the documentation when you access the lib's functions.
Is there a good reason why package names are all lower case?
Yes there is a good reason for package names to be written in lowercase letters: There is a guideline which says that only classnames are written with a capital letter in front.
Can Packages have 'circular dependencies'? In other words, can Package.A use Package.B and vice versa?
Packages do not use each other. Only classes do. And yes, that is possible, but it is bad practice.
Can anyone just show the typical syntax for declaring a class as being in a package and declaring that you wish to reference another package in a class (a using statement maybe?)
Let's assume you want to use the ArrayList class from package java.util, either use
import java.util.ArrayList;
ArrayList myList = new ArrayList();
or use without import (say you use two different classes named ArrayList from different packages)
java.util.ArrayList myList = new java.util.ArrayList();
// "your.pkg" stands in for your own package name ("package" itself is a reserved word)
your.pkg.ArrayList mySecondList = new your.pkg.ArrayList();
From Wikipedia:
"A Java package is a mechanism for organizing Java classes into namespaces"
and
"Java packages can be stored in compressed files called JAR files"
So for package a.b.c, you could have Java classes in the a, a.b, and a.b.c packages. Generally you group classes inside the same package when they represent related functionality. Functionally, the only difference between classes in the same package and classes in different packages is that the default access level for members in Java is "package-private", which means that other classes in the same package have access.
For a class a.b.c.MyClass, if you want to use MyClass in your project you would import a.b.c.MyClass or, less recommended, import a.b.c.* Also, for MyClass to reside in package a.b.c in the first place, you would declare it in the first line of MyClass.java: package a.b.c;.
To do this you could JAR up the whole package (including packages b and c and class MyClass) and put this JAR into your $CLASSPATH; this would make it accessible for your other source code to use (via the aforementioned import statement).
While it is not as easy to make circularly dependent classes work, it may not be impossible. I did get it to work in one case. Class A and class B depended on each other and wouldn't compile from scratch. But realizing that a part of class A didn't need class B, and that part was what class B needed to compile completely, I commented out the part of class A that needed class B; the remaining part of class A was able to compile, and then I was able to compile class B. I was then able to uncomment the section of class A that needed class B and compile the full class A. Both classes then functioned properly. While it is not typical, if the classes are tied together like this, it is kosher and at times possibly necessary. Just make sure you leave yourself special compile instructions for future updates.
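For what it's worth, the compiler resolves mutual references on its own as long as it sees both sources in the same run (for example javac A.java B.java). The sketch below uses nested classes only to keep the example in a single file; the names MutualDependency, A, B and depthVia are made up for illustration.

```java
public class MutualDependency {
    // A refers to B ...
    static class A {
        int depthVia(B b, int n) {
            return n == 0 ? 0 : 1 + b.depthVia(this, n - 1);
        }
    }

    // ... and B refers back to A.
    static class B {
        int depthVia(A a, int n) {
            return n == 0 ? 0 : 1 + a.depthVia(this, n - 1);
        }
    }

    // Bounces between A and B four times and counts the hops.
    static int run() {
        return new A().depthVia(new B(), 4);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 4
    }
}
```

The comment-out dance is usually only needed when the two classes are compiled in separate runs, which is one more reason to avoid circular dependencies across separately built modules.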
