Difference between Package and Directory in Java - java

In a Java Project, does keeping all the .java files in the same folder mean they are in the same package?
What is the difference in making a Package for our project compared to keeping all the project files in one folder?
This thread doesn't really address my question.

There is a relationship between package and directory, but it's one that you must maintain. If you have a class that's in "mypackage1.mypackage2", that means that the java command is going to expect to find it in a directory structure named "mypackage1\mypackage2" (assuming "backwards" Windows notation), with that directory structure further embedded in a directory (let's call it "myjava") whose name is in the classpath (or else is directly in the "current directory").
So your Java class (which internally says package mypackage1.mypackage2;) is in, say, "\Users\myName\myjava\mypackage1\mypackage2\", and you place "\Users\myName\myjava" in the class path, or else you have your current directory set to "\Users\myName\myjava".
If you mix this up, either the class will not be found at all, or you will get an error something like the ever-nebulous "NoClassDefFoundError".
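To make the mapping concrete, here is a minimal sketch of how a fully qualified class name turns into a path relative to a classpath root. The helper class PackagePaths and the class name MyClass are made up for illustration, and nested classes (which use "$" in their file names) are ignored.

```java
import java.io.File;

// Hypothetical helper for illustration only; not part of the JDK.
class PackagePaths {

    // Turns "mypackage1.mypackage2.MyClass" into
    // "mypackage1<sep>mypackage2<sep>MyClass.class".
    // The separator is a parameter so the result can be checked on any platform.
    static String toRelativeClassFile(String fullyQualifiedName, char separator) {
        return fullyQualifiedName.replace('.', separator) + ".class";
    }

    public static void main(String[] args) {
        String fqn = "mypackage1.mypackage2.MyClass"; // assumed example class
        // The printed path is relative to a classpath root such as \Users\myName\myjava
        System.out.println(toRelativeClassFile(fqn, File.separatorChar));
    }
}
```

The java command does roughly this for every classpath entry, which is why the directory that goes on the classpath is the one that contains "mypackage1", not the one that contains the class file.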
As to why one would use packages (and directories), the reason has to do with "name space" and "separation of concerns" (look those up). Java would be much harder to keep straight if there were no packages and all the "java.lang", "java.io", "sun.misc", et al. classes were together. First off, one would have to use name "prefixes" to keep them straight and avoid name conflicts. And much of the logical grouping would be lost.
With your own projects you don't need to use packages for simple little programs you write for yourself, but if you write something you might give to someone else it's polite to use a package such as "myname.myproject" (substituting your name and project of course), so the person you give it to can combine it with others without name conflicts.
In large applications you'll find using further levels of separation helps you keep the functions straight, so you know where everything is. It also discourages you from "crossing the boundary" between different functional areas, so you don't get unrelated logic intertwined.
Eclipse (if you use that) kind of muddles the issue a bit because it "wants" to provide directory and package names and will sometimes (but not always) keep them in sync.

Packages provide a logical namespace for your classes.
These packages are stored as directory levels (they are converted to nested directories) to provide a physical grouping (namespace) for your classes.
Also, note that the physical namespace has to match the logical namespace. You can't have a class declared in package com.demo under the directory structure \com\demo\temp\; it has to be under \com\demo\, and the directory that contains \com is added to the classpath so that your classes are visible to the JVM when it runs your code.
Suppose you have following directory structure: -
A
|
+-- Sample.java (contains Demo class under package B)
|
+-- Outer.java (contains Demo class - no package)
|
+-- B
|   |
|   +-- Demo.class
|
+-- C
|   |
|   +-- Abc.class
|
+-- Demo.class
Suppose your classes Abc.class and Demo.class (the one directly under directory A) aren't defined under any package, whereas Demo.class (under directory B) is defined under package B. Then you need two directories in your classpath: \A (for the two classes Demo.class and B.Demo.class) and \A\C (for Abc.class).
Classes in different packages can use the same name. That is why there won't be any conflict between the two Demo classes defined above: they are in different packages. That is the whole point of dividing them into namespaces. This is beneficial, because you will not run out of unique names for your classes.

Understanding the class loader subsystem will answer your query.
With reference to Inside the Java Virtual Machine by Bill Venners:
Given a fully qualified type name, the primordial class loader must in
some way attempt to locate a file with the type's simple name plus
".class". Hence, JVM searches a user-defined directory path stored in
an environment variable named CLASSPATH. The primordial loader looks
in each directory, in the order the directories appear in the
CLASSPATH, until it finds a file with the appropriate name: the type's
simple name plus ".class". Unless the type is part of the unnamed
package, the primordial loader expects the file to be in a
subdirectory of one the directories in the CLASSPATH. The path name of
the subdirectory is built from the package name of the type. For
example, if the primordial class loader is searching for class
java.lang.Object, it will look for Object.class in the java\lang
subdirectory of each CLASSPATH directory.
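The search described in the quote can be sketched roughly as follows. This is a simplified illustration, not how the JDK implements it: real class loaders also deal with JAR files, modules, caching and delegation. The class ClasspathSearch, the package com.demo and the temporary directories are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Simplified illustration of a directory-based classpath lookup.
class ClasspathSearch {

    // Walks the roots in order and returns the first matching class file, if any.
    static Optional<Path> locate(String fullyQualifiedName, List<Path> classpathRoots) {
        // Package name becomes the subdirectory, simple name becomes the file name.
        String relative = fullyQualifiedName.replace('.', '/') + ".class";
        for (Path root : classpathRoots) {
            Path candidate = root.resolve(relative);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Two throwaway classpath roots; only the second contains com/demo/Demo.class.
        Path first = Files.createTempDirectory("cp1");
        Path second = Files.createTempDirectory("cp2");
        Path pkgDir = Files.createDirectories(second.resolve("com").resolve("demo"));
        Files.createFile(pkgDir.resolve("Demo.class")); // empty placeholder file

        List<Path> roots = Arrays.asList(first, second);
        System.out.println(locate("com.demo.Demo", roots));      // found under the second root
        System.out.println(locate("com.demo.temp.Demo", roots)); // wrong package, not found
    }
}
```

The second lookup fails even though a Demo.class exists, because the package part of the name does not match the directory the file sits in.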

Related

Package-private class visible to some other packages (with same name) under a different source folder

When I was doing some testing with packages and package-private classes in Java, I noticed an interesting thing. The following is my project's source structure. The class MyTestClass.java in package com.test.pkg under source folder src is a package-private class. As I understand it, this should not be accessible outside this package. But, interestingly, MyTestClass.java is accessible in com.test.pkg under source folder test as well. This happens only if the package names are the same, even though they are in different source folders.
Can someone tell me why this happens ?
TestProject
|
-src
    -com.test.pkg
        -MyTestClass.java
-test
    +com.test.pkg
The source directory does not matter at all in this case. What is important is that the package names are the same, so both classes belong to the same package. Everything is working as it should.

Why do java source files require package declarations?

I think I am failing to understand Java package structure. It seems redundant to me that Java files have a package declaration inside them and are then also required to be in a directory that matches the package name. For example, if I have a MyClass.java file:
package com.example;

public class MyClass {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}
Then I would be required to have this file located in com/example, relative to the base directory, and I would execute java com.example.MyClass from the base directory to run it.
Why wouldn't the compiler be able to infer the package name by looking at the directory structure? For example, if I compiled the file from the base directory with javac com\example\MyClass.java, I don't understand why MyClass.java wouldn't implicitly belong to the com.example package.
I understand there is a default package, but it still seems that the package declaration in the source file is redundant information?
As you (implicitly) acknowledged, you are not required to declare the name of a package in the case of the default package. Let us put that quibble aside ...
The reason for this seeming redundancy is that without a package declaration, the meaning of Java [1] source code would be ambiguous. For example, a source file whose pathname was "/home/steve/project/src/com/example/Main.java" could have 7 different fully qualified names, depending on how you compiled the code. Most likely, only one of those will be the "correct" one. But you wouldn't be able to tell which FQN is correct by looking at (just) the one source file.
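To see where the 7 comes from, here is a small sketch that enumerates the candidates: each possible choice of "source root" along the path gives a different fully qualified name. The class CandidateNames is made up for the example, and it assumes a '/'-separated path ending in ".java".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists every fully qualified name a source file could
// have if the package were inferred from the directory structure alone.
class CandidateNames {

    static List<String> candidates(String sourcePath) {
        // Assumes the path ends in ".java" and uses '/' as the separator.
        String withoutExtension = sourcePath.substring(0, sourcePath.length() - ".java".length());
        String[] parts = withoutExtension.split("/");
        List<String> result = new ArrayList<>();
        String name = "";
        // Walk from the file name towards the filesystem root, treating each
        // directory in turn as one more package level.
        for (int i = parts.length - 1; i >= 0; i--) {
            if (parts[i].isEmpty()) {
                continue; // empty segment produced by the leading '/'
            }
            name = name.isEmpty() ? parts[i] : parts[i] + "." + name;
            result.add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String fqn : candidates("/home/steve/project/src/com/example/Main.java")) {
            System.out.println(fqn);
        }
    }
}
```

The list runs from Main (default package, source root /home/steve/project/src/com/example) up to home.steve.project.src.com.example.Main (source root /), and nothing in the path itself says which one was intended.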
It should also be noted that the Java language specification does not require you to organize the source code tree according to the packages. That is a requirement of a (large) family of Java compilers, but a conformant compiler could be written that did not require this. For example:
The source code could be held in a database.
The source code could be held in a file tree with random file names [2].
In such eventualities, the package declaration would not be duplicative of file pathnames, or (necessarily) of anything. However, unless there was some redundancy, finding the correct source "file" for a class would be expensive for the compiler ... and problematic for the programmer.
Considerations like the above are the practical reason that most Java tool chains rely on file tree structure to locate source and compiled classes.
1 - By this, I mean a hypothetical dialect of Java which didn't require package declarations.
2 - The compiler would need to scan the file tree to find all Java files, and parse them to work out which file defined which class. Possible, but not very practical.
Turn the question on its head:
Assume that the package statement is the important thing - It represents the namespace of the class and belongs in the class file.
So now the question is - Why do classes have to be in folders that match their package?
The answer is that it makes finding them much easier - it is just a good way to organize them.
Does that help?
You have to keep in mind that packages do not just indicate the folder structure. The folder structure is the convention Java adopted to match the package names, just like the convention that the class name must match the filename.
A package is required to disambiguate a class from other classes with the same name. For instance java.util.Date is different from java.sql.Date.
The package also gives access to methods or members which are package-private, to other classes in the same package.
You have to see it the other way round. The class has all the information about itself, the class name and the package name. Then when the program needs it, and the class is not loaded yet, the JVM knows where to look for it by looking at the folder structure that matches the package name and the class with the filename matching its class name.
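A quick way to convince yourself that the package is part of the class's identity is to compare the two Date classes mentioned above. This is only a sketch; the class name SameSimpleName and its helper method are made up, and it assumes the java.sql module is available, as it is in a standard JDK.

```java
// Compares two classes that share a simple name but live in different packages.
class SameSimpleName {

    // True when the classes have the same simple name but different
    // fully qualified names, i.e. only the package tells them apart.
    static boolean sameSimpleNameOnly(Class<?> a, Class<?> b) {
        return a.getSimpleName().equals(b.getSimpleName())
                && !a.getName().equals(b.getName());
    }

    public static void main(String[] args) {
        Class<?> utilDate = java.util.Date.class;
        Class<?> sqlDate = java.sql.Date.class;

        System.out.println(utilDate.getSimpleName() + " vs " + sqlDate.getSimpleName());
        System.out.println(utilDate.getName());
        System.out.println(sqlDate.getName());
        System.out.println("distinguished only by package: " + sameSimpleNameOnly(utilDate, sqlDate));
    }
}
```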
In fact there's no such obligation at all.
Oracle JDK's javac (and I believe most other implementations too) will happily compile your HelloWorld class, no matter what directory it is in and what package you declare in the source file.
Where the directory structure comes into the picture is when you compile multiple source files that refer to each other. At this point the compiler must be able to look them up somehow. But all it has in the source code is the fully qualified name of the referred class (which may not even have been compiled yet).
At runtime the story is similar: when a class needs to be loaded, its fully qualified name is the starting point. Now the class loader's job is to find a .class file (or an entry in a ZIP file, or any other imaginable source) based on the FQN alone, and again the simplest thing in a hierarchical file system is to translate the package name into a directory structure.
The only difference is that at runtime your "standalone" class too has to be loaded by the VM, therefore it needs to be looked up, therefore it should be in the correct folder structure (because that's how the bootstrap class loader works).

Using classpaths

I plan on becoming a certified Java programmer and am studying from the Sierra-Bates book. I had a question about classpaths. Do classpaths need to find only the supporting classes of the class I'm running/compiling, or the supporting classes and the class itself? Also, when I'm getting classes in packages from classpaths, is it legal to just put the address of the file (the path to it), instead of putting its root package? Thanks.
1 - a classpath has to give access to each class that needs to run in your program. That would include the main class and any classes it calls and those they call. If there is some code in one of those classes that is never called, in many cases, you don't need to have the classes referenced by the uncalled code.
2 - you have to put the root of the packages in the classpath. So a class "com.bob.myprog.Main" would need to have the class path point to the folder where the "com" package/folder lies. It will need to contain a "bob" folder and "bob" will need to contain a "myprog" folder with "Main.class" in it.
Classpath has to contain both the supporting classes and the class itself.
However, sometimes you can run a single file without specifying classpath (and it will work).
As specified in http://docs.oracle.com/javase/tutorial/essential/environment/paths.html :
The default value of the class path is ".", meaning that only the
current directory is searched. Specifying either the CLASSPATH
variable or the -cp command line switch overrides this value.
Therefore, if you have a class MyClass compiled in the current directory, the following will work:
java MyClass
while pointing classpath to another directory will lead to an error (classpath no longer contains MyClass):
java -cp lib MyClass
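If you are unsure what the JVM is actually searching, you can print the effective classpath from inside a program. A minimal sketch follows; the class name ShowClasspath is made up, while java.class.path is the standard system property that holds the classpath.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Prints the classpath entries the running JVM was started with.
class ShowClasspath {

    // Splits a classpath string on the platform separator (':' or ';').
    static List<String> entries(String classpath) {
        return Arrays.asList(classpath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        for (String entry : entries(classpath)) {
            System.out.println(entry);
        }
    }
}
```

Started from the directory containing the compiled class with plain java ShowClasspath, this should show the current directory as the only entry; with -cp lib it would show lib instead, and the class itself would then have to be reachable from there.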
When you have a class in a package, it is not enough to put the address to the class file in the classpath. According to SCJP Sun Certified Programmer for Java 5 Study Guide:
In order to find a class in a package, you have to have a directory in
your classpath that has the package's leftmost entry (the package's
"root") as a subdirectory.

How Classloader determines which classes it can load?

I'm reading on class loading in Java.
Motivation
Assuming we have a classloader hierarchy that looks like this, I understand that classes loaded by First are not directly accessible by classes loaded by Second (and vice versa).
Bootstrap
|
System
|
Common
/ \
First Second
I also understand that a classloader checks with its parent class loader whether it can load the class and, if that is the case, delegates the loading to its parent.
Question
How do classloaders actually determine whether they can load some given class?
That differs depending on the implementation of the classloader. But all Classes a ClassLoader can load are retrieved by ClassLoader.findClass(String)
There are many implementations but the most common one is the URLClassLoader which loads classes from directories and jar files.
The classloader checks all classes (java class files) within your CLASSPATH path variable. If your class is found there, it exists, otherwise it doesn't.
So practically, your /src directory and all subdirectories (=packages) are scanned.
The classloader transforms the requested class name into a file name and then tries to find a "class file" of that name in a file system. As @poitroae notes, it uses the CLASSPATH variable, if set, as a starting place. Most IDEs and similar tools extend this to include your working directories for the project.
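The parent delegation mentioned in the question can be observed directly by walking a loader's parent chain. This is just a sketch with a made-up class name (LoaderChain); the loaders it prints depend on the JVM and on how the program is launched. Many JVMs represent the bootstrap loader as null, so it does not appear in the list.

```java
import java.util.ArrayList;
import java.util.List;

// Walks from a class loader up through its parents.
class LoaderChain {

    // Returns the loader followed by each of its ancestors, stopping when
    // getParent() returns null (commonly the bootstrap loader).
    static List<ClassLoader> chain(ClassLoader start) {
        List<ClassLoader> result = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            result.add(loader);
        }
        return result;
    }

    public static void main(String[] args) {
        for (ClassLoader loader : chain(LoaderChain.class.getClassLoader())) {
            System.out.println(loader);
        }
        // Core classes such as String come from the bootstrap loader.
        System.out.println("String loaded by: " + String.class.getClassLoader());
    }
}
```

Each loader in the chain asks its parent first and only searches its own locations (directories, JARs, or whatever its findClass implementation supports) when the parent cannot supply the class.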

Java: Point of subdirectories

First of all: I'm not entirely familiar with Java, and the few things I know I have learned while playing with Java.
However, there is something I have noticed in pretty much any open-source Java project: the use of a lot of subdirectories for the sources, which usually look like so:
./src/main/java/com/somedomainname/projectname/sourcefile.java
Now, why so many subdirectories? what's the deal with the domainname?
The domain name is used for the package name - so that file would be for the class
com.somedomainname.projectname.sourcefile
where com.somedomainname.projectname is the package.
Conventionally, source file organization mirrors the package layout. The normal Java compiler doesn't actually enforce directory structure (although some IDEs such as Eclipse will complain if you put things in the "wrong" directories) but it does force public classes to be in a file with the same name. Non-public classes can go in any file, but conventionally the filename matches the class name there, too. It makes it very easy to navigate to any class without any prior knowledge.
The Java language specification doesn't say that a compiler must enforce the convention for public classes; it explicitly says that it can though. See section 7.2 of the JLS for more details.
This directory structure is used as a convention that shows where the library is from and separates it from other sources.
One reason to use this structure is that it is the standard used by Maven.
Maven is a build tool that helps to manage the dependencies of a project. Maven is designed for convention over configuration, so you will often see this directory structure to make it work with Maven.
Maven specifies that the directory structure start with /src/main/java for Java files, and the rest is based on the naming convention for namespaces.
The use of the domain name in the path is to prevent class collisions. If 2 different libraries both supply a class with the same name, the domain name namespace allows them to both be used.
A Java package is a mechanism for
organizing Java classes into
namespaces similar to the modules of
Modula. Java packages can be stored in
compressed files called JAR files,
allowing classes to download faster as
a group rather than one at a time.
Programmers also typically use
packages to organize classes belonging
to the same category or providing
similar functionality.
...from http://en.wikipedia.org/wiki/Java_package
Subdirectories serve as an organizational tool so that you don't just have one directory with tons of Java files. The reason you often see a domain name is that conventionally people derive Java package names from their domain names in order to prevent collisions with other developers. So although we both might have a util.Stringutil class, if I name mine com.mydomain.util.Stringutil and yours is com.yourdomain.util.Stringutil, we can have a project containing both classes without a collision.
There is an interesting read on java packages and directories in the newer O'Reilly book Java: The Good Parts (starting at the bottom of page 46).
...the required interaction between the package system and the filesystem
is both regrettable and a pain...
This is meant as a standard to define unique locations for java source code. It is convention to follow this package structure, which is why you see it everywhere. It's not required to do it that way - you can name your packages whatever you want. It is very commonplace to follow this convention, however.
package prefix.organization.project.ClassName;
package prefix.organization.project.package.ClassName;
package prefix.organization.project.package.subpackage.ClassName;
When storing Java source code files, each part of the package name translates into a subdirectory. So the same three classes shown above would be located in the corresponding directories off the main classpath.
prefix/organization/project/ClassName.java
prefix/organization/project/package/ClassName.java
prefix/organization/project/package/subpackage/ClassName.java
When compiling by hand, be sure that the main classpath directory is the current directory or is within the classpath in order that the source code files can be found.
As for the src/main/java part of it, it seems this comes from Maven. I've never used that software. I don't understand why they would need so many, since my projects (I use Eclipse) just have a src folder there instead.
./src/main/java/com/somedomainname/projectname/sourcefile.java : Decomposed
1. src/main/java
this is the directory that needs to be passed to the javac compiler stating where the source code for compilation can be found.
1.1 src/test/java
this is where the unit test classes should be kept.
1.2 src/main/resources and src/test/resources
these are the corresponding directories where resources such as properties files should be kept.
1.3 Separate output directories.
main and test classes and resources should be compiled to their own separate output directories. Maven uses target/classes and target/test-classes. When you jar your compiled class files for distribution, you don't want to include test classes and test resource files.
2. com/somedomainname/projectname
this directory structure corresponds to the package declaration in the classes found in projectname, i.e. package com.somedomainname.projectname
3. SourceFile.java
corresponds to the class name that it defines, and it should by convention start with an uppercase character; see http://www.oracle.com/technetwork/java/codeconvtoc-136057.html
Also in the link above you will find out that the default package naming convention uses the domain name in reverse.
The Java Language Specification defines a package naming convention that says that package names should include a domain name, as it provides a globally-rooted namespace.
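As a small illustration of the reverse-domain convention, the sketch below turns a domain name into a package prefix. The class ReverseDomain is made up for the example, and it deliberately ignores the extra rules for domains containing hyphens, leading digits or Java keywords.

```java
import java.util.Locale;

// Hypothetical helper: derives a package prefix from a domain name by
// reversing its dot-separated labels.
class ReverseDomain {

    static String toPackagePrefix(String domain) {
        // Package names are conventionally all lowercase.
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder prefix = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            if (prefix.length() > 0) {
                prefix.append('.');
            }
            prefix.append(labels[i]);
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        // "somedomainname.com" is the placeholder domain used in the question.
        System.out.println(toPackagePrefix("somedomainname.com") + ".projectname");
    }
}
```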
The source files need to be in subfolders that match the package name, because the Sun Java compiler, javac, strongly encourages it. Additionally, many other build tools and IDEs either strongly encourage or require that the source .java files are stored in paths that match the package.
