I always have doubts when creating packages: I want to take advantage of package-private access, but at the same time I want to divide similar classes into separate packages.
The problem comes when you understand that packages are not hierarchical in Java:
At first, packages appear to be
hierarchical, but they are not.
source
Imagine I have an API defined with its classes at foo.bar; only the classes the API client needs are public. Then I have another package, foo.bar.pojos, with some internal objects that the API needs. These classes need to be public so they can be accessed by foo.bar, but this means the API client could also access them by importing the package foo.bar.pojos.
What is the common package policy that should be followed?
I've seen two ways of doing this.
The first one consists in separating the public API and internal classes into two different artefacts (jars). The documentation is separated as well, and it's thus easy for the end user to make the distinction between what is internal and what is not. But it sometimes makes things more complex to have two jars, two source trees, etc.
The second one consists in delivering a single jar, but with good documentation that makes clear what's internal and what's not. The textual documentation can explain how to use the API (and thus avoid talking about the internals). And the javadoc can specify that a class is for internal use and is thus subject to change.
Yes, Java packages don't give you enough control over your dependencies. The classic way to deal with this is to put external APIs in one package and internal implementation classes in another, and rely on people's good sense to avoid creating dependencies on the latter.
With Maven and OSGi, you have an additional mechanism for managing dependencies between modules / bundles of packages. In the case of OSGi, you can explicitly declare some packages as not exported, and an OSGi-aware development environment will prevent people from creating harmful dependencies. Maven's module support is weaker, but at least it controls dependency cycles.
Finally, you could use custom PMD rules to enforce your project's modularization conventions ... in the same way that there are rules to discourage dependencies on Java's "com.sun.*" package tree.
It is a mess.
Using only what Java itself offers, you have to put everything in the same package. You end up with a single (or a few) packages with lots of classes, and no good way to group them for yourself (but at least that problem does not leak outside). Most people don't do that, though, and as a result, your (as a developer on top of these libraries) public classpath is littered with stuff you should never need to see.
You might like OSGi, which has (and enforces) the concept of bundle-private packages. Those are not exported to the outside world.
I am a newbie and just learned that if I define say
package my.first.group.here;
...
then the Java files that are in this package will be placed under my/first/group/here directory.
What is the main purpose of putting some Java files in a package? Also, if I choose to adopt this, how should I group them?
Thank you
EDIT: For anyone who might have the same question again, I just found this tutorial on packages from Sun.
Let's start with the definition of a "Java package", as described in the Wikipedia article:
A Java package is a mechanism for
organizing Java classes into
namespaces similar to the modules of
Modula. Java packages can be stored in
compressed files called JAR files,
allowing classes to download faster as
a group rather than one at a time.
Programmers also typically use
packages to organize classes belonging
to the same category or providing
similar functionality.
So based on that, packages in Java are simply a mechanism used to organize classes and prevent class name collisions. You can name them anything you wish, but Sun has published some naming conventions that you should use when naming packages:
Packages
The prefix of a unique package name is
always written in all-lowercase ASCII
letters and should be one of the
top-level domain names, currently com,
edu, gov, mil, net, org, or one of the
English two-letter codes identifying
countries as specified in ISO Standard
3166, 1981.
Subsequent components of the package
name vary according to an
organization's own internal naming
conventions. Such conventions might
specify that certain directory name
components be division, department,
project, machine, or login names.
Examples:
com.sun.eng
com.apple.quicktime.v2
edu.cmu.cs.bovik.cheese
In a large application, you are bound to have two classes named exactly the same (java.util.Date and java.sql.Date), especially when you start bringing in third-party jars. So basically, you can use packages to ensure uniqueness.
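To make the name clash concrete, here is a minimal sketch that uses both Date classes in one file. The class name DateNames and the method describe are made up for illustration; only one of the two classes can be imported under its simple name, so the other is referred to by its fully qualified name.

```java
import java.util.Date;

public class DateNames {
    // Uses both classes side by side; the package name is what tells them apart.
    static String describe() {
        Date utilDate = new Date(0L);                  // java.util.Date via the import
        java.sql.Date sqlDate = new java.sql.Date(0L); // fully qualified to avoid the clash
        return utilDate.getClass().getName() + " / " + sqlDate.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Importing both java.util.Date and java.sql.Date in the same file would be a compile error, which is why one of them has to be spelled out in full.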
Most importantly, in my opinion, packaging breaks down projects into meaningful segments. So my SQL package has sql-related code, and my logger package handles logging.
In addition to the namespacing mentioned in other answers, you can limit access to methods and fields based on the scope declared on that member.
Members with the public scope are freely accessible. To limit access you normally define them as private (i.e. hidden outside the class).
You can also use the protected scope to limit access to the type, its children, and other types in the same package.
There is also the default scope (a member with no qualifier has the default scope, also called package-private), which allows only types in the same package to access the member. This can be an effective way of sharing fields and methods without making them too widely available, and can help with testing.
For example, the method below would be visible to all other types in the same package.
    public class Foo {
        // No modifier: package-private, visible only within the same package.
        int doSomething() {
            return 1;
        }
    }
To test the method you could define another type in the same package (but probably a different source location), that type would be able to access the method.
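    // Assuming JUnit 4 is on the test classpath.
    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // Lives in the same package as Foo, so it can call the package-private method.
    public class FooTest {
        @Test
        public void testDoSomething() {
            Foo foo = new Foo();
            assertEquals(1, foo.doSomething());
        }
    }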
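The four access levels discussed above can also be inspected at runtime. The following is only an illustrative sketch (the class AccessLevels and its method names are invented here): it declares one method per level and uses reflection to report which level each one has.

```java
import java.lang.reflect.Modifier;

public class AccessLevels {
    public void everyone() {}                // accessible from anywhere
    protected void packageAndSubclasses() {} // same package, plus subclasses elsewhere
    void packageOnly() {}                    // default scope: same package only
    private void classOnly() {}              // this class only

    // Reports the access level of the named no-argument method declared in this class.
    static String levelOf(String methodName) {
        int mods;
        try {
            mods = AccessLevels.class.getDeclaredMethod(methodName).getModifiers();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
        if (Modifier.isPublic(mods)) return "public";
        if (Modifier.isProtected(mods)) return "protected";
        if (Modifier.isPrivate(mods)) return "private";
        return "package-private"; // no modifier keyword present
    }

    public static void main(String[] args) {
        for (String name : new String[] {"everyone", "packageAndSubclasses", "packageOnly", "classOnly"}) {
            System.out.println(name + " -> " + levelOf(name));
        }
    }
}
```

Note that there is no keyword for the default scope; it is simply the absence of public, protected and private.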
It allows the program to be composed from multiple different programs/components/libraries, so that their class names will not conflict and the components are easier to organize. See http://java.sun.com/docs/books/tutorial/java/package/index.html
In Java it's customary to name packages as reverse domain names. For example, if your company's domain is "initech.com" and you are making a program called "Gizmo", the package names are typically prefixed "com.initech.gizmo", with subpackages for different components of the program.
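As a small illustration of the reverse-domain convention, the sketch below derives a package prefix from a domain and a program name. The helper packagePrefix is hypothetical, not part of any library, and it ignores edge cases such as hyphens or digits that are not legal in package names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PackageNames {
    // Reverses the domain components and appends the lower-cased program name.
    static String packagePrefix(String domain, String program) {
        List<String> parts = new ArrayList<>(
                Arrays.asList(domain.toLowerCase(Locale.ROOT).split("\\.")));
        Collections.reverse(parts);
        parts.add(program.toLowerCase(Locale.ROOT));
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(packagePrefix("initech.com", "Gizmo"));
    }
}
```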
Packages give you flexibility in how you separate classes. They can be used for:
separating projects
separating modules
separating application layers (business, web, dao)
further finer grained code separation
For example
com.mycompany.thisproject.thismodule.web
Could indicate the web layer of some module.
Ultimately, there are 3 core reasons we want to use packages in Java.
1) Easier Maintenance
Organizing classes into packages follows the separation of concerns principle by encapsulation and allows for better cohesion in the overall system design. Moving further, packaging-by-feature allows teams of developers to find relevant classes and interfaces for making changes, supporting vertical-slicing techniques for scaled approaches used in agile methodology. For more information, see blog post: Package your classes by Feature and not by Layers and Coding: Packaging by vertical slice.
2) Provide Package security
Packages allow external code to access only the members declared public in the contained classes. Members with no modifier are accessible only to classes within the same package; protected members are additionally accessible to subclasses in other packages. For more information, see post:
Which Java access modifier allows a member to be accessed only by the subclasses in other package?
3) Avoid similar naming
Similar to the namespaces of .NET, class names are contained within the scope of their containing package. This means that two different packages can contain classes with the same name. This is because the packages themselves have different names and, therefore, the fully qualified names are different. For more information, see the tutorial Naming a Package: The Java Tutorials.
From the Wikipedia page on the topic:
"A Java package is a mechanism for organizing Java classes into namespaces similar to the modules of Modula. Java packages can be stored in compressed files called JAR files, allowing classes to download faster as a group rather than one at a time. Programmers also typically use packages to organize classes belonging to the same category or providing similar functionality."
also, if i choose to adopt this, how
should i group them?
This depends largely on the design pattern(s) you will employ in your project. For the most part (particularly, if you're quite new) you'll want to group them by functionality or some other logical similarity.
Other people have provided very Java-specific answers which are fine, but here's an analogy: why do you organize files into directories on your hard drive? Why not just have a flat file system with everything in one directory?
The answer, of course, is that packages provide organization. The part of the program that interfaces with the database is different than the part of the program that displays a UI to the user, so they'll be in different packages.
Like directories, packages also provide a way to solve name conflicts. You can have a temp.txt in a couple of different directories in the same way that you could have two classes with the same name in different packages. This becomes important (1) when you start combining your code with other people's code from the internet or (2) once you realize how Java's class loading works.
Another important thing about packages is the protected member for access control.
Protected is somewhere between public (everyone can access) and private (only the class itself can access). Things marked as protected can be accessed from within the same package or from subclasses. This means that for limited access you don't have to put everything in the same class.
Java is very exact in its implementation. It doesn't really leave room for fudging.
If everyone were to use the same package, they would have to find some "World Wide" way to ensure that no two class names ever collided.
This lets every single class ever written fit into its own "Place" that you don't have to look at if you don't want to.
You may have different "Point" objects defined in 4 different places on your system, but your class will only use the one you expect (because you import that one).
The way they ensure that everyone has their own space is to use your reverse domain, so mine is "tv.kress.bill". I own that domain--Actually I share it with my brother "tv.kress.doug" and even though we share the same domain, we can't have a collision.
If a hundred divisions in your company each develop in Java, they can do so without collision and knowing exactly how to divide it.
Systems that don't do this kind of division seem really flaky to me now. I might use them to hack together a script for something personal, but I'd feel uncomfortable developing anything big without some strict packaging going on.
I expected it to be possible to use, e.g., guava-19 in myModuleA and guava-20 in myModuleB, since jigsaw modules have their own classpath.
Let's say myModuleA uses Iterators.emptyIterator(), which is removed in guava-20, and myModuleB uses the new static method FluentIterable.of(), which wasn't available in guava-19. Unfortunately, my test is negative. At compile time everything looks fine, but at runtime the result is a NoSuchMethodError. This means that whichever version of the class comes first on the class loader decides which module fails.
So is this encapsulation with the underlying coupling still in place? I found a reason for myself: it couldn't be supported because transitive dependencies would have the same problem as before. If a Guava class with a version conflict appears in a signature in ModuleA, and ModuleB depends on it, which class should be used?
But why can we read all over the internet that "jigsaw - the module system stops the classpath hell"? We now have multiple smaller "similar-to-classpaths" with the same problems. It's more an uncertainty than a question.
Version Conflicts
First a correction: You say that modules have their own class path, which is not correct. The application's class path remains as it is. Parallel to it the module path was introduced but it essentially works in the same way. Particularly, all application classes are loaded by the same class loader (by default at least).
That there is only a single class loader for all application classes also explains why there can't be two versions of the same class: The entire class loading infrastructure is built on the assumption that a fully qualified class name suffices to identify a class with a class loader.
This also opens the path to the solution for multiple versions. Like before you can achieve that by using different class loaders. The module system native way to do that would be to create additional layers (each layer has its own loader).
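As a rough sketch of what creating an additional layer looks like with the java.lang.ModuleLayer API: the helper childLayer below is made up for illustration, and the directory and module names in the comment (layers/a, my.module.a) are hypothetical. The runnable part only builds an empty child layer, since no real module directories are assumed to exist.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.Set;

public class LayerSketch {
    // Builds a child layer of the boot layer from the given module directories.
    // All modules of the new layer are defined to one fresh class loader.
    static ModuleLayer childLayer(Set<String> rootModules, Path... modulePath) {
        ModuleLayer boot = ModuleLayer.boot();
        ModuleFinder finder = ModuleFinder.of(modulePath);
        Configuration cf = boot.configuration()
                .resolve(finder, ModuleFinder.of(), rootModules);
        return boot.defineModulesWithOneLoader(cf, ClassLoader.getSystemClassLoader());
    }

    public static void main(String[] args) {
        // Hypothetical real use, one layer per conflicting library version:
        //   ModuleLayer a = childLayer(Set.of("my.module.a"), Path.of("layers/a"));
        //   ModuleLayer b = childLayer(Set.of("my.module.b"), Path.of("layers/b"));
        ModuleLayer empty = childLayer(Set.of());
        System.out.println("parent is boot layer: "
                + empty.parents().contains(ModuleLayer.boot()));
    }
}
```

Classes in such a layer are then typically reached through layer.findLoader(moduleName) or a ServiceLoader bound to the layer, because they are not visible to the application's default class loader.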
Module Hell?
So does the module system replace class path hell with module hell? Well, multiple versions of the same library are still not possible without creating new class loaders, so this fundamental problem remains.
On the other hand, now you at least get an error at compile time or at launch due to split packages. This prevents the program from subtly misbehaving, which is not that bad, either.
Theoretically it is possible to use different versions of the same library within your application. The concept that enables this: layering!
When you study Jigsaw under the hood you find a whole section dedicated to this topic.
The idea is basically that you can further group modules using these layers. Layers are constructed at runtime; and they have their own classloader. Meaning: it should be absolutely possible to use modules in different versions within one application - they just need to go into different layers. And as shown - this kind of "multiple version support" is actively discussed by the people working on java/jigsaw. It is not an obscure feature - it is meant to support different module versions under one hood.
The only disclaimer at this point: unfortunately there are no "complete" source code examples out there (of which I know), thus I can only link to that Oracle presentation.
In other words: there is some sort of solution to this versioning problem on the horizon - but it will take more time to gain experience with this new idea in real-world code. And to be precise: you can have different layers that are isolated by different class loaders. There is no support that would allow "the same object" to use modV1 and modV2 at the same time. You can only have two objects, one using modV1 and the other modV2.
( German readers might want to have a look here - that publication contains another introduction to the topic of layers ).
Java 9 doesn't solve such problems. In a nutshell, what Java 9 did was to extend the classic access modifiers (public, protected, package-private, private) to the jar level.
Prior to Java 9, if a module A depended on module B, then all public classes from B were visible to A.
With Java 9, visibility can be configured so that it is limited to a subset of classes: each module defines which packages it exports and which modules it requires.
Most of those checks are done by the compiler.
From a runtime perspective (class loader architecture), there is no big change: all application modules are loaded by the same class loader, so it's not possible to have the same class in different versions in the same JVM unless you use a modular framework like OSGi or manipulate class loaders yourself.
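To illustrate exports and requires, here is a minimal sketch. The module declaration in the comment uses hypothetical names (com.example.lib and its packages); the runnable part queries the JDK's own java.base module through the reflective Module API to show that an exported package and an internal one are treated differently.

```java
public class ExportCheck {
    // A module declaration (module-info.java) for a hypothetical library might read:
    //
    //   module com.example.lib {
    //       requires java.sql;            // modules this one depends on
    //       exports com.example.lib.api;  // packages visible to other modules
    //       // com.example.lib.internal is not exported and stays hidden
    //   }

    // True if the module containing 'anchor' exports the package to all other modules.
    static boolean exportedToAll(Class<?> anchor, String packageName) {
        return anchor.getModule().isExported(packageName);
    }

    public static void main(String[] args) {
        // java.util is part of the public API of java.base.
        System.out.println("java.util: " + exportedToAll(String.class, "java.util"));
        // jdk.internal.misc is only exported to a few selected JDK modules.
        System.out.println("jdk.internal.misc: " + exportedToAll(String.class, "jdk.internal.misc"));
    }
}
```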
As others have hinted, JPMS layers can help with that. You can use them manually, but Layrry might be helpful to you: it is a fluent API and configuration-based launcher for running layered applications. It allows you to define the layer structure by means of configuration and it will fire up the layer graph for you. It also supports the dynamic addition/removal of layers at runtime.
Disclaimer: I'm the initial creator of Layrry
I'm looking for different ways to prevent internals leaking into an API. This is a huge problem because once these internals leak into the API, you can run into either unexpected incompatibility issues or frozen internals.
One of the simplest ways to do so is just make use of different Maven modules; one module with API and one module with implementation. This way it is impossible to expose the implementation from the API.
Unfortunately, not everyone agrees this is the best approach. Are there other alternatives, e.g. using checkstyle or other 'architecture checking' tools?
PS: Java 9 for us is not usable, since we are about to upgrade to Java 8 and this will be the lowest supporting version for quite some time to come.
Following your checkstyle idea, it should be possible to set up rules which examine import statements in source files.
Checkstyle has built-in support for that, specifically the IllegalImport and ImportControl rules.
This of course works best if public and internal classes can be easily separated by package names.
The idea for IllegalImport would be that you configure a TreeWalker in checkstyle which only looks at your API-sources, and which excludes imports from internal packages.
With the ImportControl rule on the other hand you can define very detailed access rules for the whole application/module in a separate XML file.
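To show the underlying idea of examining import statements, here is a deliberately simple sketch. It is not how Checkstyle implements IllegalImport or ImportControl; the class ImportCheck, its regular expression and the package names in the usage example are assumptions made for illustration, and a real tool would parse the source rather than pattern-match it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImportCheck {
    // Matches "import a.b.C;", "import static a.b.C.d;" and "import a.b.*;" at line start.
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.]+(?:\\.\\*)?)\\s*;", Pattern.MULTILINE);

    // Returns every imported name in 'source' that starts with 'forbiddenPrefix'.
    static List<String> illegalImports(String source, String forbiddenPrefix) {
        List<String> hits = new ArrayList<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            String imported = m.group(1);
            if (imported.startsWith(forbiddenPrefix)) {
                hits.add(imported);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String apiSource = "package com.acme.stuff.api;\n"
                + "import java.util.List;\n"
                + "import com.acme.stuff.internal.Helper;\n";
        System.out.println(illegalImports(apiSource, "com.acme.stuff.internal."));
    }
}
```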
It is standard in Java to define an API using interfaces and implement them using classes. That way you can change the "internals" however you want and nothing changes for the user(s) of the API.
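A minimal single-file sketch of that idea follows. The names Greeters, Greeter and PoliteGreeter are invented for illustration; the point is only that callers see the interface and a factory method, while the implementing class stays hidden and can be replaced freely.

```java
public final class Greeters {
    // The public API: callers program against this interface only.
    public interface Greeter {
        String greet(String name);
    }

    private Greeters() {
        // no instances; this class only hosts the factory method
    }

    // Factory method: the only way for callers to obtain an implementation.
    public static Greeter polite() {
        return new PoliteGreeter();
    }

    // The implementation is private, so it can change without affecting callers.
    private static final class PoliteGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name + ".";
        }
    }

    public static void main(String[] args) {
        System.out.println(Greeters.polite().greet("Ann"));
    }
}
```

In a multi-package layout the same effect comes from a public interface in the API package and a package-private implementation class next to its factory.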
One alternative is to have one module (jar file) for both API and implementation (but then again, is it an API or just any kind of library?). Inside it, one separates classes and interfaces by using packages, e.g. com.acme.stuff.api and com.acme.stuff.impl. It is important to make the classes inside the latter package package-private (top-level classes cannot be declared protected).
Not only does the package name show the consuming developer "hey, this is the implementation", it is also not possible to use anything inside (let's omit reflections at this point for the sake of simplicity).
But again: This is against the idea of an API, because usually the implementation can be changed. With this approach one cannot separate API from implementation, because both are inside the same module.
If it is only about hiding internals of a library, then this is one (not the one) feasible approach.
And just in case you meant a library instead of an API, which only exposes its "frontend" (by using interfaces or abstract classes and such), use different package names, e.g. com.acme.stuff and com.acme.stuff.internal. The same visibility rules apply of course.
Also: This way one does not need Checkstyle and other burdens.
Here is a good start : http://wiki.netbeans.org/API_Design
Key point: do not expose more than you want. Obviously, the less of the implementation is expressed in the API, the more flexibility one can have in the future. There are some tricks that one can use to hide the implementation but still deliver the desired functionality.
I think you don't need any checkstyle or anything like that, just a good old solid design and architecture should be enough. Polymorphism is all you need here.
One of the simplest ways to do so is just make use of different Maven
modules; one module with API and one module with implementation. This
way it is impossible to expose the implementation from the API.
Yes, I totally agree, hide as much as possible, separate your interface in a standalone project.
Is there some sort of tool that you can point at a set of Java classes, and it produces output showing the transitive imports of each class?
I understand that imports are not "transitive" from the point of view of the language itself - i.e. if com.acme.X imports com.acme.Y, and com.acme.Y imports com.acme.Z, that does not mean that you can refer to com.acme.Z within com.acme.X. But that's not what I mean:
Rather, I mean that com.acme.X nonetheless depends upon com.acme.Z (at least under the current implementations of X and Y), and I want to know that fact. In fact I want to know it for a large number of classes, and so I'm hoping that there's a tool to determine it automatically.
Either a standalone tool or an Eclipse plugin or feature would be great.
Thanks in advance.
EDIT to hopefully show what I want this for:
I have a huge monolithic jar that contains many features that are (essentially) completely unrelated. I'd like to break it apart into several smaller, more manageable, and more self-coherent jars.
Unfortunately, I can't do it simply by breaking it up based on packages, because many of the packages themselves are not self-coherent either. For example, there's a "com.acme.utils" package. Two things in that package probably have nothing in common except for the fact that they're both, in some sense, "utilities". One may be a utility for some particular business function, another may be TCP/IP utilities, another may be a set of string utilities, another may be some completely unrelated business function.
And there are a bunch of packages like this. So when you look at the transitivity of imports from the point of view of packages, they snowball without limit, and so more or less everything in the monolithic jar depends on everything else in the monolithic jar.
So I'd like to start by considering transitivity of imports from the class point of view, rather than the package point of view. That way I should be able to more easily determine what classes need to be reorganized from what existing packages into new, more coherent packages, and then after that I can break the monolithic jar apart by packages / sets of packages.
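The class-level transitivity described here amounts to a reachability computation over the graph of direct dependencies. Below is a minimal sketch, assuming the direct dependencies have already been extracted into a map (for instance from the output of the JDK's jdeps tool); the class TransitiveDeps and the com.acme names in the example are purely illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TransitiveDeps {
    // Returns every class reachable from 'start' by following direct dependencies.
    static Set<String> closure(Map<String, Set<String>> direct, String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>(direct.getOrDefault(start, Set.of()));
        while (!work.isEmpty()) {
            String current = work.pop();
            // Only expand a class the first time it is seen, so cycles terminate.
            if (seen.add(current)) {
                work.addAll(direct.getOrDefault(current, Set.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> direct = Map.of(
                "com.acme.X", Set.of("com.acme.Y"),
                "com.acme.Y", Set.of("com.acme.Z"));
        System.out.println(closure(direct, "com.acme.X"));
    }
}
```

Running the closure for every class and comparing the resulting sets is one way to spot groups of classes that could move into the same new package.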
We're using Sonar for software metrics: http://www.sonarsource.org/
So as not to keep all my classes in a single src -> 'package_name' folder, I'm creating different sub-packages to separate my classes into groups such as utilities, models, the activities themselves, etc. I'm not sure whether this is good practice and whether people do the same in real projects.
Yes, it's definitely standard practice to separate your classes into packages. It's good to establish a convention for how they are separated, to make it easier to find things later. Two common approaches:
Put things into packages based on what they are: model, service, data access (DAO), etc.
Put things into packages based on what function they support (for example, java.io, java.security, etc.).
I've used both and keep coming back to the former because it's less subjective (it's always clear whether a class is a model or a service, but not always clear whether it supports one function or another function).
Doing it by class type the way you describe is one way that I've seen in real projects. I don't care for it as much as I used to because when I need to make a change or add a feature I tend to need to have several packages expanded in my IDE. I prefer (when I have the choice) to group classes by feature instead. That way I know where to look for all classes that support that feature.
The convention I prefer is to group classes first by module, then by functionality. For example, you could have the following structure:
com.example.modulea - modulea specific code that doesn't have any real need of a different package
com.example.modulea.dao - data access for module a
com.example.modulea.print - printing for module a
...
com.example.moduleb - moduleb specific code that doesn't have any real need of a different package
com.example.moduleb.dao - data access for module b
com.example.moduleb.print - printing for module b
In this fashion, code is clearer by package.
In the other style, of grouping by pure functionality, the package size tends to be quite large. If your project contains 15 modules, and each module has one or more elements per package, that's at least 15 classes per package. I much prefer clearly separated packages than packages that simply group things because "oh here are some printing utilities that are used for every module but only one module actually uses one of them from this package" - it just gets confusing.