I think I am failing to understand Java package structure. It seems redundant to me that Java files have a package declaration inside them and are then also required to be in a directory that matches the package name. For example, if I have a MyClass.java file:
package com.example;
public class MyClass {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}
Then I would be required to have this file located in com/example, relative to the base directory, and I would execute java com.example.MyClass from the base directory to run it.
Why wouldn't the compiler be able to infer the package name by looking at the directory structure? For example, if I compiled the file from the base directory with javac com\example\MyClass.java, I don't understand why MyClass.java wouldn't implicitly belong to the com.example package.
I understand there is a default package, but it still seems that the package declaration in the source file is redundant information.
As you (implicitly) acknowledged, you are not required to declare the name of a package in the case of the default package. Let us put that quibble aside ...
The reason for this seeming redundancy is that without a package declaration, the meaning of Java [1] source code would be ambiguous. For example, a source file whose pathname was "/home/steve/project/src/com/example/Main.java" could have 7 different fully qualified names, depending on how you compiled the code. Most likely, only one of those will be the "correct" one. But you wouldn't be able to tell which FQN is correct by looking at (just) the one source file.
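To make the count of 7 concrete, here is a minimal sketch that lists every fully qualified name the example path could map to if the package were inferred from the directory structure alone. The class name CandidateNames and its method are hypothetical helpers written for this illustration; they are not part of javac or any other tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: enumerates the names a source file could have
// if its package were guessed from its path instead of declared.
final class CandidateNames {

    static List<String> candidates(String sourcePath) {
        // Drop the leading slash and split the path into its segments.
        String[] parts = sourcePath.replaceFirst("^/+", "").split("/");
        // The last segment is the file name; strip ".java" to get the class name.
        String name = parts[parts.length - 1].replaceFirst("\\.java$", "");

        List<String> result = new ArrayList<>();
        result.add(name); // the class in the default package
        // Walk outward, treating one more parent directory as part of the package.
        for (int i = parts.length - 2; i >= 0; i--) {
            name = parts[i] + "." + name;
            result.add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        candidates("/home/steve/project/src/com/example/Main.java")
                .forEach(System.out::println);
    }
}
```

Which of the printed names is the intended one depends on which directory is treated as the source root, and that is exactly the information the package declaration supplies.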
It should also be noted that the Java language specification does not require you to organize the source code tree according to the packages. That is a requirement of a (large) family of Java compilers, but a conformant compiler could be written that did not require this. For example:
The source code could be held in a database.
The source code could be held in a file tree with random file names [2].
In such eventualities, the package declaration would not be duplicative of file pathnames, or (necessarily) of anything. However, unless there was some redundancy, finding the correct source "file" for a class would be expensive for the compiler ... and problematic for the programmer.
Considerations like the above are the practical reason that most Java tool chains rely on file tree structure to locate source and compiled classes.
1 - By this, I mean a hypothetical dialect of Java that didn't require package declarations.
2 - The compiler would need to scan the file tree to find all Java files, and parse them to work out which file defined which class. Possible, but not very practical.
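As a rough illustration of the scanning that footnote 2 describes, the sketch below pulls the declared package out of a piece of source text. It assumes a simple regular expression is good enough; a real compiler parses the file properly, and this pattern would be fooled by a package statement inside a comment or string. The name PackageScanner is made up for the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the package declaration in Java source text.
final class PackageScanner {

    // Simplified pattern: a line that starts with "package <dotted.name>;".
    private static final Pattern PACKAGE_DECLARATION = Pattern.compile(
            "^\\s*package\\s+([A-Za-z_$][\\w$]*(?:\\.[A-Za-z_$][\\w$]*)*)\\s*;",
            Pattern.MULTILINE);

    static Optional<String> declaredPackage(String source) {
        Matcher matcher = PACKAGE_DECLARATION.matcher(source);
        // No match means the file belongs to the unnamed (default) package.
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String source = "package com.example;\npublic class MyClass {}";
        System.out.println(declaredPackage(source).orElse("(default package)"));
    }
}
```

A tool chain that stored sources under arbitrary names would have to run something like this over every file before it could resolve a single class reference, which is why the directory convention is so convenient.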
Turn the question on its head:
Assume that the package statement is the important thing - It represents the namespace of the class and belongs in the class file.
So now the question is - Why do classes have to be in folders that match their package?
The answer is that it makes finding them much easier - it is just a good way to organize them.
Does that help?
You have to keep in mind that packages do not just indicate the folder structure. The folder structure is the convention Java adopted to match the package names, just like the convention that the class name must match the filename.
A package is required to disambiguate a class from other classes with the same name. For instance java.util.Date is different from java.sql.Date.
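A small sketch of that disambiguation: both classes have the simple name Date, but their fully qualified names differ, so they are distinct classes. The class SameSimpleName is a hypothetical wrapper for the demonstration, and it assumes the java.sql module is available, as it is in a standard JDK.

```java
// Hypothetical demo class: compares two JDK classes that share a simple name.
final class SameSimpleName {

    // True when the two classes have the same simple (unqualified) name.
    static boolean sameSimpleName(Class<?> a, Class<?> b) {
        return a.getSimpleName().equals(b.getSimpleName());
    }

    public static void main(String[] args) {
        Class<?> utilDate = java.util.Date.class;
        Class<?> sqlDate = java.sql.Date.class;

        System.out.println(utilDate.getName());                // java.util.Date
        System.out.println(sqlDate.getName());                 // java.sql.Date
        System.out.println(sameSimpleName(utilDate, sqlDate)); // true
        System.out.println(utilDate.equals(sqlDate));          // false
    }
}
```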
The package also gives access to methods or members which are package-private, to other classes in the same package.
You have to see it the other way round. The class has all the information about itself, the class name and the package name. Then when the program needs it, and the class is not loaded yet, the JVM knows where to look for it by looking at the folder structure that matches the package name and the class with the filename matching its class name.
In fact there's no such obligation at all.
Oracle JDK's javac (and I believe most other implementations too) will happily compile your HelloWorld class, no matter what directory it is in and what package you declare in the source file.
Where the directory structure comes into the picture is when you compile multiple source files that refer to each other. At this point the compiler must be able to look them up somehow. But all it has in the source code is the fully qualified name of the referred class (which may not even have been compiled yet).
At runtime the story is similar: when a class needs to be loaded, its fully qualified name is the starting point. Now the class loader's job is to find a .class file (or an entry in a ZIP file, or any other imaginable source) based on the FQN alone, and again the simplest thing in a hierarchical file system is to translate the package name into a directory structure.
The only difference is that at runtime your "standalone" class also has to be loaded by the VM, so it has to be looked up, and so it should be in the correct folder structure (because that's how the default class loaders search the class path).
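The translation from a fully qualified name to a location can be sketched in a few lines. This is only an illustration of the naming convention for a top-level class, under the assumption that classes are stored as files or ZIP entries; ClassFileLocator is a hypothetical name, not a JDK class.

```java
// Hypothetical helper: maps a fully qualified class name to the relative
// path at which a directory or JAR on the class path would hold it.
final class ClassFileLocator {

    static String classFilePath(String fullyQualifiedName) {
        // Each package segment becomes a directory; the class becomes a .class file.
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFilePath("com.example.MyClass")); // com/example/MyClass.class
    }
}
```

Because the lookup starts from the name alone, a class file stored anywhere else would simply not be found.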
For a given Java source code file I want to list all the (fully qualified names of) classes that are (directly) required for compilation. In other words: All classes that are directly used by the code in the source code file, coming from imports, fully qualified names in the source code and other compile time means, but not by reflection or other runtime means.
Is there a way to "ask" the java compiler for this list? Are there other ways to get it?
PS: By "directly" I mean the following: If my source code file requires class A for compilation which uses class B, then class B has to be present to compile the code, but it is not a direct use.
You can try this:
It can be easily done with just javac. Delete your class files and then compile the main class. javac will recursively compile any classes it needs (provided you don't hide package-private classes/interfaces in strangely named files).
I plan on becoming a certified Java programmer and am studying from the Sierra-Bates book. I have a question about classpaths. Does the classpath need to locate only the supporting classes of the class I'm running/compiling, or the supporting classes and the class itself? Also, when I'm getting classes in packages from the classpath, is it legal to just give the address of the file (the path to it) instead of the directory containing its root package? Thanks.
1 - A classpath has to give access to each class that needs to run in your program. That includes the main class, any classes it calls, and the classes those call in turn. If some code in one of those classes is never called, in many cases you don't need to have the classes referenced by the uncalled code.
2 - You have to put the root of the packages in the classpath. So for a class "com.bob.myprog.Main", the classpath needs to point to the folder where the "com" package/folder lies. That "com" folder needs to contain a "bob" folder, and "bob" needs to contain a "myprog" folder with "Main.class" in it.
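Point 2 can be sketched as a small calculation: given where a class file sits and its fully qualified name, strip off the package directories to find the folder that belongs on the classpath. The class ClasspathRoot and the /work/classes path are assumptions made up for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: works out which directory must be on the classpath
// for a given class file to be found under its fully qualified name.
final class ClasspathRoot {

    static Path rootFor(Path classFile, String fullyQualifiedName) {
        // The part of the path dictated by the package and class name.
        Path expectedTail = Paths.get(fullyQualifiedName.replace('.', '/') + ".class");
        if (!classFile.endsWith(expectedTail)) {
            throw new IllegalArgumentException(
                    classFile + " does not match the name " + fullyQualifiedName);
        }
        // Step up once per path element of the tail; what remains is the root.
        // (For a path with no directory above the package root this ends up null.)
        Path root = classFile;
        for (int i = 0; i < expectedTail.getNameCount(); i++) {
            root = root.getParent();
        }
        return root;
    }

    public static void main(String[] args) {
        Path classFile = Paths.get("/work/classes/com/bob/myprog/Main.class");
        System.out.println(rootFor(classFile, "com.bob.myprog.Main"));
    }
}
```

With the layout above, the program would be started with the printed directory as the -cp value and com.bob.myprog.Main as the class name.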
The classpath has to contain both the supporting classes and the class itself.
However, sometimes you can run a single class without specifying a classpath (and it will work).
As specified in http://docs.oracle.com/javase/tutorial/essential/environment/paths.html :
The default value of the class path is ".", meaning that only the
current directory is searched. Specifying either the CLASSPATH
variable or the -cp command line switch overrides this value.
Therefore, if you have a class MyClass compiled in the current directory, the following will work:
java MyClass
while pointing the classpath to another directory will lead to an error (the classpath no longer contains MyClass):
java -cp lib MyClass
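To see which class path a running program actually received, you can read the standard java.class.path system property. The sketch below splits it into entries; the class name ClassPathEntries is hypothetical, and the exact entries printed depend on how the JVM was started.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: splits a class path string into its entries.
final class ClassPathEntries {

    static List<String> entries(String classPath) {
        // Entries are separated by ':' on Unix-like systems and ';' on Windows.
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        // Reflects -cp / CLASSPATH if given, otherwise the default.
        entries(System.getProperty("java.class.path")).forEach(System.out::println);
    }
}
```

Running it once with no options and once with -cp lib (plus the directory containing this class) shows how the switch replaces the default rather than adding to it.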
When you have a class in a package, it is not enough to put the address to the class file in the classpath. According to SCJP Sun Certified Programmer for Java 5 Study Guide:
In order to find a class in a package, you have to have a directory in
your classpath that has the package's leftmost entry (the package's
"root") as a subdirectory.
When a class file belongs to a package,
then the declaration
package PackageName;
is included in the source code of that file.
So when the JVM is invoked by writing
java PackageName.classfilename
it gets executed.
Is it that "package PackageName" guarantees to the JVM that this class file belongs to this very package?
Because if we omit the "package PackageName" statement, the JVM still finds the class file but gives
Exception in thread "main" java.lang.NoClassDefFoundError: Classfilename
wrongname PackageName/ClassfileName
This means the JVM finds the file, but for some reason it considers that this class file has the wrong name.
The package declarations on your classes must match the folder structure that you have for your code.
Packages are used by the JVM for several "tasks", from the visibility of methods, to the resolution of situations where two classes could have the same name.
A NoClassDefFoundError actually means the JVM cannot find the class with the package you gave it. If you omit the package definition on the class and run the program like:
java ClassFileName
The JVM will find the class, as long as you're running the java command from the folder where your class is.
Also... package names should be all lowercase and Class names should start with an Uppercase. :) Conventions are really helpful when someone else is reading your code!
Hope the comment helped.
The class file needs to exist on the file system in the same hierarchy as is defined in the package name. If you remove the package name, I believe you must have the file in the root folder of your jar to work in the "unnamed" package. Likely you removed the package line from the source file but still left the class definition inside of the PackageName folder.
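The "wrong name" complaint comes from the fact that a compiled class records its own package and class name, and the JVM compares that with the name it was asked to load. A minimal sketch of reading that recorded information through reflection follows; RecordedName is a hypothetical class, and getPackageName assumes Java 9 or later.

```java
// Hypothetical demo: shows the package and class name a loaded class carries.
final class RecordedName {

    static String describe(Class<?> type) {
        // getPackageName() returns "" for a class in the unnamed package.
        return "package=" + type.getPackageName() + ", class=" + type.getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe(java.util.ArrayList.class)); // package=java.util, class=ArrayList
    }
}
```

If the name stored in the class file and the name derived from the command line and folder structure disagree, loading fails even though the file itself was found.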
First of all: I'm not entirely familiar with Java, and the few things I know I have learned while playing with Java.
However, there is something I have noticed in pretty much any open source Java project: the use of a lot of subdirectories for the sources, which usually look like so:
./src/main/java/com/somedomainname/projectname/sourcefile.java
Now, why so many subdirectories? What's the deal with the domain name?
The domain name is used for the package name - so that file would be for the class
com.somedomainname.projectname.sourcefile
where com.somedomainname.projectname is the package.
Conventionally, source file organization mirrors the package layout. The normal Java compiler doesn't actually enforce directory structure (although some IDEs such as Eclipse will complain if you put things in the "wrong" directories) but it does force public classes to be in a file with the same name. Non-public classes can go in any file, but conventionally the filename matches the class name there, too. It makes it very easy to navigate to any class without any prior knowledge.
The Java language specification doesn't say that a compiler must enforce the convention for public classes; it explicitly says that it can though. See section 7.2 of the JLS for more details.
This directory structure is used as a convention that shows where the library is from and separates it from other sources.
One reason to use this structure is that it is the standard used by Maven.
Maven is a build tool that helps to manage the dependencies of a project. Maven is designed for convention over configuration, so you will often see this directory structure to make it work with Maven.
Maven specifies that the directory structure start with /src/main/java for Java files, and the rest is based on the naming convention for namespaces.
The use of the domain name in the path is to prevent class collisions. If 2 different libraries both supply a class with the same name, the domain name namespace allows them to both be used.
A Java package is a mechanism for
organizing Java classes into
namespaces similar to the modules of
Modula. Java packages can be stored in
compressed files called JAR files,
allowing classes to download faster as
a group rather than one at a time.
Programmers also typically use
packages to organize classes belonging
to the same category or providing
similar functionality.
...from http://en.wikipedia.org/wiki/Java_package
Subdirectories serve as an organizational tool so that you don't just have one directory with tons of Java files. The reason you often see a domain name is that people conventionally derive Java package names from their domain names in order to prevent collisions with other developers. So although we both might have a util.Stringutil class, if I name mine com.mydomain.util.Stringutil and yours is com.yourdomain.util.Stringutil, we can have a project containing both classes without a collision.
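The reverse-domain convention can be sketched as a tiny transformation. This is a simplified illustration only: it assumes the domain contains nothing that is illegal in a Java identifier (no hyphens, no labels that are keywords or start with a digit), and PackageNames is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: derives a package name from a domain and a project name.
final class PackageNames {

    static String packageFor(String domain, String project) {
        // Split "somedomainname.com" into labels and reverse their order.
        List<String> labels = new ArrayList<>(
                Arrays.asList(domain.toLowerCase(Locale.ROOT).split("\\.")));
        Collections.reverse(labels);
        // Append the project as the innermost package segment.
        labels.add(project.toLowerCase(Locale.ROOT));
        return String.join(".", labels);
    }

    public static void main(String[] args) {
        System.out.println(packageFor("somedomainname.com", "projectname"));
        // com.somedomainname.projectname
    }
}
```

Since domain names are globally unique, two developers following this scheme end up with different prefixes even when their class names are identical.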
There is an interesting read on java packages and directories in the newer O'Reilly book Java: The Good Parts (starting at the bottom of page 46).
...the required interaction between the package system and the filesystem
is both regrettable and a pain...
This is meant as a standard to define unique locations for java source code. It is convention to follow this package structure, which is why you see it everywhere. It's not required to do it that way - you can name your packages whatever you want. It is very commonplace to follow this convention, however.
prefix.organization.project.ClassName
prefix.organization.project.package.ClassName
prefix.organization.project.package.subpackage.ClassName
When storing Java source code files, each part of the package name translates into a subdirectory. So the same three classes shown above would be located in the corresponding directories off the main classpath.
prefix/organization/project/ClassName.java
prefix/organization/project/package/ClassName.java
prefix/organization/project/package/subpackage/ClassName.java
When compiling by hand, be sure that the main classpath directory is the current directory or is within the classpath in order that the source code files can be found.
As for the src/main/java part of it, it seems this comes from Maven. I've never used that software. I don't understand why they would need so many, since my projects (I use Eclipse) just have a src folder there instead.
./src/main/java/com/somedomainname/projectname/sourcefile.java, decomposed:
1. src/main/java
This is the directory that is passed to the javac compiler to state where the source code for compilation can be found.
1.1 src/test/java
This is where the unit test classes should be kept.
1.2 src/main/resources and src/test/resources
These are the corresponding directories where resources such as properties files should be kept.
1.3 Separate output directories
Main and test classes and resources should be compiled to their own separate output directories. Maven uses target/classes and target/test-classes. When you jar your compiled class files for distribution, you don't want to include test classes and test resource files.
2. com/somedomainname/projectname
This directory structure corresponds to the package declaration in the classes found in projectname, i.e. package com.somedomainname.projectname
3. SourceFile.java
The file name corresponds to the class that it defines, and by convention it should start with an uppercase character; see http://www.oracle.com/technetwork/java/codeconvtoc-136057.html
Also in the link above you will find out that the default package naming convention uses the domain name in reverse.
The Java Language Specification defines a package naming convention that says that package names should include a domain name, as it provides a globally-rooted namespace.
The source files need to be in subfolders that match the package name, because the Sun Java compiler, javac, strongly encourages it. Additionally, many other build tools and IDEs also either strongly encourage or require that the source .java files are stored in paths that match the package.
I'm currently getting to know Java and OSGi, so I've read a few books. In one particular book the class loading is described.
You can download it (free and legal) from the author's page (Neil Bartlett):
OSGi Book
On pages 9 and 10 are these pictures:
[image: http://img265.imageshack.us/img265/4127/picture1qct.png]
[image: http://img297.imageshack.us/img297/594/picture2yv.png]
It seems like there is the possibility that our class "Foo" won't use the class "Bar" of foobar.jar, but instead class "Bar" from naughty.jar.
Because of the flat and global structure of the Java classpath this could be, but as far as I know you would define a package from where you want to import a certain class:
import foobar.Bar;
This should prevent loading the wrong class, shouldn't it? Of course assuming that the package is called "foobar".
The import statement has nothing to do with class loading. It just allows you to use the short name Bar instead of the fully qualified foobar.Bar. If both foobar.jar and naughty.jar contain a class with the fully qualified name foobar.Bar, the classloader will load the class from the first jar file on the classpath that contains the required class.
Good idea, but unfortunately packages are independent of the jar file names. You can have things in arbitrary packages all in the same jar file, and arbitrary jar file names with unrelated package names. It's up to the classloader to resolve them for you.
The problem is that both foobar.jar and naughty.jar might have a class whose fully qualified name is foobar.Bar. Then foobar.Foo resolves to the foobar.Bar of naughty.jar instead of the foobar.Bar of foobar.jar.
Hope this helps.
The author is assuming here that both versions of the Bar classes are in the same package (or there wouldn't be any problem). What the author is describing can happen if naughty.jar is "first" in the class path. In that case, the classloader would pick the naughty version (the classloader scans the classpath in the natural order and picks the first class found).
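The first-match behaviour can be simulated without real jar files. In the sketch below each classpath entry is modelled as a set of fully qualified class names, and lookup returns the first entry that contains the requested name. FirstMatchLookup and the map-based model are assumptions for illustration; a real class loader reads actual directories and archives.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical simulation of a flat classpath searched in order.
final class FirstMatchLookup {

    // Returns the name of the first entry that provides the requested class.
    static Optional<String> resolve(Map<String, Set<String>> classpath, String fqn) {
        for (Map.Entry<String, Set<String>> entry : classpath.entrySet()) {
            if (entry.getValue().contains(fqn)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, standing in for classpath order.
        Map<String, Set<String>> classpath = new LinkedHashMap<>();
        classpath.put("naughty.jar", Set.of("foobar.Bar"));
        classpath.put("foobar.jar", Set.of("foobar.Foo", "foobar.Bar"));

        System.out.println(resolve(classpath, "foobar.Bar").orElse("not found")); // naughty.jar
        System.out.println(resolve(classpath, "foobar.Foo").orElse("not found")); // foobar.jar
    }
}
```

Swapping the two put calls changes the answer for foobar.Bar, which is the ordering dependence the book is warning about; the import statement in Foo plays no part in it.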
The import doesn't give you the liberty of loading the class from the desired jar. You can read more about classloaders here: Java Classloaders