Java package structure convention


I was working with TypeScript and JavaScript, and I stopped for a bit to think about namespaces and how we organize code in Java.
Now, pretty often I see multiple classes with the same purpose, just for use with different data, being placed in different packages with different names.
While placing those classes in different packages is a good practice to maintain the project/module in a good state, why would we have to give them different names? I mean, their purpose is the same.
I can answer this question myself: because often we use those classes inside the same unit and they would clash, thus requiring a long full package specification for one or the other.
But aren't we violating the DRY principle? We should use our directory (package) structure to understand which domain those classes work in.
As I wrote above, I suppose many developers ignore this DRY principle just to avoid long package names. Then why are we creating those monstrous package hierarchies?
Googling "Java packages best practices" results in suggestions such as:
com.mycompany.myproduct.[...]
Where do those suggestions come from? Aren't we just wasting space?
We obviously do not want to write that every time.
MyClass myInstance;
com.mycompany.myproduct.mypackage.MyClass myOtherInstance;
But it could have been
myfeature.MyClass myInstance
myotherfeature.MyClass myInstance;
We could even specify the full package for both.
So, where do those best practices come from?
As it has been said, this convention dates back to the first releases of Java.
Name clashes could easily be solved by qualifying the classes of an imported dependency (such as a classpath library) with their short package hierarchy, which would emphasize our own code and keep it cleaner. It is also important to remember that we have access to package-level visibility, which seems overlooked nowadays.

As a commenter points out, this convention was established at the very beginning of Java, to allow a global namespace to be used for all classes.
One thing about Java that influences this -- and is different from TypeScript and JavaScript -- the name of a class and its package (as well as all the names of classes and packages it uses) is fixed at compile time. You can't recombine classes into a different hierarchy without modifying their source code. So it's not as flexible as an interpreted language where you can just move the hierarchy around and change the locations from which you load code.
You can, of course, put everything in your application in the top-level package, for example, and be pretty confident you'll get away with it. You'll be able to do that as long as everyone else follows the existing convention. Once they stop doing that, and libraries start putting classes wherever they want, you'll have collisions, and you'll have to change your application. Or libraries will start colliding with each other, and if you don't have the source, you won't really be able to fix it.
So there are good reasons.
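To make the point about fixed names concrete, here is a minimal sketch using two JDK classes that share the simple name Date. The class name QualifiedNames and its helper method are made up for the illustration. Note that Java has no partially qualified form: a name like myfeature.MyClass only compiles if myfeature is itself a top-level package, so a clash is resolved either by an import or by writing the whole package path.

```java
import java.util.Date;

// Hypothetical demo class; only the two JDK Date classes are real.
class QualifiedNames {
    // The simple name Date means java.util.Date because of the import above.
    // The return type has no import, so the complete package path is required.
    static java.sql.Date asSqlDate(Date d) {
        return new java.sql.Date(d.getTime());
    }

    public static void main(String[] args) {
        Date utilDate = new Date(0L);
        java.sql.Date sqlDate = asSqlDate(utilDate);
        // The fully qualified names are baked into the compiled classes.
        System.out.println(utilDate.getClass().getName()); // java.util.Date
        System.out.println(sqlDate.getClass().getName());  // java.sql.Date
    }
}
```

Moving either class to another package would mean recompiling every class that refers to it, which is why a stable, globally unique prefix is useful.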
My opinion-based part of this -- sticking with conventions and a language's (or platform's) culture makes sense. Even if the conventions are silly, in our view. There's a place for breaking them but the advantages have to be pretty big. Other developers will be used to the convention, tools will make assumptions about the conventions ... usually it makes sense to go with the flow.
Does it really make sense to use getters and setters for every property in Java? If you understand the domain you're using well, it very well might not. But stop doing it and your code isn't going to make sense to other members of your team, you'll have to continuously revisit the decision as people join, and so forth.
If it's a one-person project and always will be, of course, you can do what you want.

Related

rule of thumb relating to access of classes from sibling packages

Is there any rule of thumb about whether your package structure should allow a class to be accessed from another class in a sibling package?
For example, I have a class that represents a Login page:
project.page.login.LoginPage
And a class that represents an Account home page:
project.page.account.AccountHome
Both pages use the standard chrome for the project (Header, Footer, Menu stuff and BasePage). Is it better to put those classes in a sibling package, e.g. project.page.chrome:
project.page.chrome.BasePage
project.page.chrome.Menu
project.page.chrome.Footer
project.page.chrome.Header
or in the parent package:
project.page
e.g.
project.page.BasePage
project.page.chrome.Menu
etc
I know that this is a stylistic rule of thumb question, which in a way is subjective.
What I wish to know is whether there is a commonly accepted rule for this sort of thing and, if so, what the reasoning behind it is: what are the problems or benefits associated with each approach?
Further to Vampire's answer.
My question is not whether you can reference classes from one sibling package to another. It's whether you should and what are the reasons (either way).

There are maybe not exact rules but some guidelines. Link to answer to another similar question Since this answer appears on SO, pasting only the link here. Reading the Uncle Bob articles may give you some pointers.

There is no conventional way to name your packages or group your classes into packages. But most of the time people tend to follow the standard Java APIs and adopt their style.
For example (to name a few): util, common, basic

You can use this, but the best way is to review your specification and make the structure as readable and extensible as possible.
The best advice I can give you: learn the rules, then forget them and make your own.

There is no such thing as a parent- or child-package. Each package in Java is completely stand-alone. Having them named hierarchically and stored like that in the filesystem is just a convention, but technically, those packages are all absolutely non-related stand-alone packages.
It is totally up to you how you like to organize your source-code in packages.
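A small sketch of what "stand-alone" means in practice, using JDK packages (the class name PackagesAreFlat is made up, and getPackageName needs Java 9 or later). A wildcard import of java.util does not pull in java.util.concurrent, even though the names look nested:

```java
import java.util.*;
// Without this line the code does not compile: java.util.* does not
// reach into java.util.concurrent, which is a separate package.
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for illustration.
class PackagesAreFlat {
    static Map<String, Integer> counts() {
        Map<String, Integer> visits = new ConcurrentHashMap<>();
        visits.put("login", 1);
        return visits;
    }

    public static void main(String[] args) {
        System.out.println(Map.class.getPackageName());          // java.util
        System.out.println(counts().getClass().getPackageName()); // java.util.concurrent
    }
}
```

The same holds for package-private members: a class in project.page.login gets no special access to project.page just because the names share a prefix.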

Packages let you group related classes together with proper abstraction.
Technically, in Java there is no rule that sibling packages should allow class access from each other.
From a design and good-practice point of view, it is advisable to group the classes and packages in a structured and intuitive way.
For example:
project.page
project.page.login
project.page.chrome
project.page.account
This helps with modularizing and unit testing the application. A proper package hierarchy also helps with quick debugging using tools like log4j, because loggers are usually named after their classes and log levels can then be set per package.

Naming convention for homonymous classes in different libraries

In any project with a relatively large number of dependencies, there are always a lot of commonly named classes in different libraries. For example, Configuration is very widely used.
It slows down the programmer, who has to carefully pick the right class from the list. It is also very irritating if you have to use different configurations in one class, because they have to be prefixed with the full package name.
I'm writing a library which also needs a Configuration class. Should I use this name? Or is it better to name it {Libname}Configuration? Is there any common way to avoid such problems?

I think you should name your class in the way that makes its usage most clear, and you shouldn't care too much if other classes with the same name exist. As you know, the usage of packages reduces the risk of naming collisions.
For any project I think it's good that the whole team should have a convention related to basic naming and formatting used. It's important to use a consistent naming convention so that when new people work on it, they pick it up faster. I think that conventions also help increase productivity since it's easier to remember names.
I think it's good to spend some time thinking about classes, not only in terms of algorithms but also in terms of what business role they fill. Thinking about why a class is necessary and what it brings to the project can make you more aware of the way your method/class/variable works within the application workflow.
That being said, I think your IDE may have an option to hide some of the classes it shows. I'm using IntelliJ and it has a feature for this situation, even though it's a bit hidden.

I think it is usually not a very good idea to start with library name for a class, simply because in the long run that will make it more difficult to remember, and because it diminishes readability.
There are ways to set up your IDE (depending on which IDE you use) so that autocomplete shows the most used classes first. You can also get to a class quickly by first typing the name, then the library, when using autocomplete. These all depend on your IDE. But generally it seems like bad practice to start a class name with the name of the library.

Too many imports are spamming my Java code

In my project I have a shapes package which has shapes I designed for my graphics program, e.g., Rectangle and Circle. I also have one or two more packages that have the same names as java.awt classes.
Now, since I do not want to rename every class in my codebase, to show my source files which class I mean when I, say, declare a new Rectangle, I need to either:
1- import the rectangle class explicitly, i.e., import shapes.Rectangle
or
2- import only the java.awt classes I need and not import java.awt.* which automatically includes the awt.Rectangle
Now the problem is that both ways result in a lot of importing. I currently have an average of 15-25 imports in each source file, which is seriously making my code mixed-up and confusing.
Is too many imports in your code a bad thing? Is there a way around this?

Yes, too many imports is a bad thing because it clutters your code and makes your imports less readable.
Avoid long import lists by using wildcards.
Kevlin Henney talks about this exact Stack Overflow question 27:54 into his presentation "Clean Coders Hate What Happens to Your Code When You Use These Enterprise Programming Tricks" from NDC London, 16-20 Jan 2017.

If you use glob imports, it's possible to break your code with a namespace clash just by updating a dependency that introduces new types (typically not expected to be a breaking change). It could be a pain to fix in a large codebase that was liberal with their use of glob imports. That is the strongest reason I can think of for why it's a good idea to specify dependencies explicitly.
It's easier to read code which has each of the imports specified, because you can see where types are coming from without requiring IDE-specific features and mouse hovering, or going through large pages of library documentation. Many people read a lot of code outside the IDE in code review, diffs, git history, etc.

Another alternative is to type the fully qualified class name where you need it. In my example, there are two Element classes: my own org.opensearch.Element and org.w3c.dom.Element.
To resolve the name conflict, as well as to minimize import clutter, I've done this (in my org.opensearch.Element class):
public org.w3c.dom.Element toElement(org.w3c.dom.Document doc) { /* .... */ }
As you can see, the return type is fully qualified (i.e., I've specified the fully qualified class name of Element).
Problem solved! :-)
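A runnable sketch of that approach, under a few assumptions: the package declaration is omitted so everything fits in one file, the name and text fields are invented for the demo, and ElementDemo is a hypothetical driver class. Only the org.w3c.dom and javax.xml.parsers types are real JDK APIs.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical driver class for the illustration.
class ElementDemo {
    static org.w3c.dom.Element build() {
        try {
            org.w3c.dom.Document doc =
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            // The simple name Element refers to our own class below.
            return new Element("title", "hello").toElement(doc);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        org.w3c.dom.Element node = build();
        System.out.println(node.getTagName() + "=" + node.getTextContent()); // title=hello
    }
}

// Stand-in for the answer's own Element class (fields are assumptions).
class Element {
    private final String name;
    private final String text;

    Element(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // The DOM types are never imported, so they are always written in full.
    org.w3c.dom.Element toElement(org.w3c.dom.Document doc) {
        org.w3c.dom.Element node = doc.createElement(name);
        node.setTextContent(text);
        return node;
    }
}
```

This keeps the import list short, at the cost of longer signatures wherever the second Element type appears.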

I use explicit imports, and have done so for many years. In all my projects in the last decade this has been agreed with team members, and all are happy to use explicit imports and avoid wildcards. It has not been controversial.
In favour of explicit imports:
precise definition of what classes are used
less fragile as other parts of the codebase change
easier to code review
no guessing about which class is in which package
In favour of wildcards:
less code
easier to add and maintain the imports when using a text editor
Early in my career, I did use wildcard imports, because back then IDEs were not so sophisticated, or some of us just used text editors. Managing explicit imports manually is quite a bit of effort, so wildcard imports really help.
However, at least once I was bitten by the use of wildcard imports, and this led me to my current policy of explicit imports only.
Imagine you have a class that has 10 wildcard imports, and it compiles successfully. Then you upgrade 5 jar files to newer versions (you have to upgrade them all, because they are related). Then the code no longer compiles, there is a class not found error. Now which package was that class in? What is the full name of the class? I don't know because I'm only using the short name, now I have to diff the old and new versions of the jars, to see what has changed.
If I had used explicit imports, it would be clear which class had been dropped and what its package was, and thus which jar is responsible (by looking at the other classes in that package).
There are also problems reading code history, or looking at historic merges. When you have wildcard imports there is uncertainty for the reader about which class is which, and thus what the semantics of the code changes are. With explicit imports you have a precise description of the code, and it acts as a better historical record.
So overall, the small amount of extra effort to maintain the imports, and the extra lines of code, are easily outweighed by the extra precision and determinism given by explicit imports.
The only case when I still use wildcards is when there are more than 50 imports from the same package. This is rare, and usually just for constants.
Update1: To address the comment of Peter Mortensen, below...
The only defence Kevlin Henney makes in his talk for using wildcards is that the name collision problem doesn't happen very often. But it's happened to me, and I've learnt from it. And I discuss that above.
He doesn't cover all the points I've made in my answer above. But most importantly, I think the choice you make, explicit or wildcard, doesn't matter that much; what matters is that everyone on the project/codebase agrees and uses a consistent style. Kevlin Henney goes on to talk about cargo-cult programming. My decisions as stated above are based on personal lessons over decades, not cargo-cult reasoning.
If I was to join a project where the existing style was to use wildcards, I'd be fine with it. But if I was starting a new project I'd use precise imports.
Interestingly in nodejs there is no wildcard option. (Although you do have 'default' imports, but it's not quite the same).

It's subjective and depends greatly on the circumstances. I sometimes bounce between the two.
It is generally good practice to be specific by default, but it can also be higher maintenance. It's not a perfect rule. However, the cost of being more specific (a higher initial cost) tends to reveal itself early as measurable or perceptible drag, whereas being lazy by default tends to surface as a more severe problem.
Over-including entire namespaces can create bloat and clashes, as well as hide changes in structure, but in certain cases the convenience may outweigh those costs.
To give a simple case:
I use a package with a hundred classes.
What if I use one of its classes?
What if I use all but one of them?
It's similar to the whitelist versus blacklist problem.
In an ideal situation the package hierarchy will subdivide package types enough to establish a good balance.
In certain systems I've seen people do the equivalent of import *. Perhaps the most horrifying case is when I saw someone do something similar to apt install *. Though it seemed clever so as to never have to worry about missing dependencies it imposed enormous burdens that far outweighed any benefit.
All things can be taken to the extreme. For example I could argue for the utility of imports as close to as needed but if we're going to do that why not just always use the fully qualified names all the time?
Problems like this are annoying as the low cognitive load of consistency is always preferable but when it comes down to it in various given circumstances you may need to play it by ear.
It's important to be proportionate. Doing something a thousand times to avoid something that happens one time, self presents and takes about as much effort to fix tends to result in a waste.
Different objectives may make "too many" imports a good thing. I have a custom JavaScript framework which would likely horrify many at a glance for its stacks of imports.
What's not immediately obvious is that this is part of a deliberate strategy.
This makes it easier to cleanly package the code with all its specific dependencies and then transmit that over a network.
Imports act as plug points at build time, so dependencies can be swapped for each platform target.
This tends not to be as much of a problem for languages that are less dynamic and that do not suffer greatly (or at all) from the overhead of excessively importing namespaces. The moral of this story is that import strategies can vary enormously but be valid for their given circumstances. You can only go so far in taking general approaches.
In each situation you will need to have your bearings and a sense of the lay of the land. If the import conventions and structure are causing a nuisance, then it's necessary to narrow down the how and why. Too many imports may not be the result of a specific strategy but of things such as packing too much into a single file. At the same time, some files are naturally large and naturally require many imports.
Badly applied separation of concerns, or organisation that fails to keep related things together, can create a graph with grossly excessive edges that could be reduced with better organisation. To some degree it's not abnormal for code to be clustered by dependencies, more so by specific dependencies than by general ones.
If a code base is well organised into a graph that is fairly close to optimal, with neither excessive splitting nor excessive merging and with minimal distance between things, you will tend to find that even when you are specific with imports, the majority of cases stay within a reasonable size.

I don't get all the non-answers posted here.
Yes, individual imports are a bad idea, and add little, if anything, of value.
Instead just explicitly import the conflicts with the class you want to use (the compiler will tell you about conflicts between the awt and shapes package) like this:
import java.awt.*;
import shapes.*;
import shapes.Rectangle; // New Rectangle, and Rectangle.xxx will use shapes.Rectangle.
This has been done for years, since Java 1.2, with awt and util List classes. If you occasionally want to use java.awt.Rectangle, well, use the full class name, e.g., new java.awt.Rectangle(...);.
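Here is a runnable sketch of the same trick with the awt and util List classes mentioned above (the class name WildcardWithOverride is made up for the illustration). A single-type import takes precedence over both wildcard imports, so the simple name List is no longer ambiguous:

```java
import java.awt.*;
import java.util.*;
import java.util.List; // without this line, List is ambiguous and the file does not compile

// Hypothetical demo class for illustration.
class WildcardWithOverride {
    static List<Rectangle> sample() {
        // List resolves to java.util.List, Rectangle to java.awt.Rectangle.
        List<Rectangle> shapes = new ArrayList<>();
        shapes.add(new Rectangle(0, 0, 10, 20));
        return shapes;
    }

    public static void main(String[] args) {
        System.out.println(sample().get(0).width);         // 10
        // The other List is still reachable by its full name.
        System.out.println(java.awt.List.class.getName()); // java.awt.List
    }
}
```

The ambiguity only becomes a compile error when the clashing simple name is actually used, so two wildcard imports with overlapping names can coexist until then.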

It's normal in Java world to have a lot of imports - you really need to import everything. But if you use an IDE, such as Eclipse, it does the imports for you.

It's a good practice to import class by class instead of importing whole packages
Any good IDE, such as Eclipse, will collapse the imports in one line, and you can expand them when needed, so they won't clutter your view
In case of conflicts, you can always refer to fully qualified classes, but if one of the two classes is under your control, you can consider renaming it (with Eclipse, right click on the class, choose menu Refactor → Rename, it will take care to update all its references).
If your class is importing from AWT and from your package of shapes, that's OK. It's OK to import several classes; however, if you find yourself importing from lots of disparate sources, it could be a sign that your class is doing too much and needs to be split up.

What is a good practice to combine your classes

To avoid keeping all my classes in a single src -> 'package_name' folder, I'm creating different sub-packages to separate my classes into groups such as utilities, models, the activities themselves, etc. I'm not sure whether this is good practice and whether people do the same in real projects.

Yes, it's definitely standard practice to separate your classes into packages. It's good to establish a convention for how they are separated, to make it easier to find things later. Two common approaches:
Put things into packages based on what they are: model, service, data access (DAO), etc.
Put things into packages based on what function they support (for example, java.io, java.security, etc.).
I've used both and keep coming back to the former because it's less subjective (it's always clear whether a class is a model or a service, but not always clear whether it supports one function or another function).

Doing it by class type the way you describe is one way that I've seen in real projects. I don't care for it as much as I used to because when I need to make a change or add a feature I tend to need to have several packages expanded in my IDE. I prefer (when I have the choice) to group classes by feature instead. That way I know where to look for all classes that support that feature.

The convention I prefer is to group classes first by module, then by functionality. For example, you could have the following structure:
com.example.modulea - modulea specific code that doesn't have any real need of a different package
com.example.modulea.dao - data access for module a
com.example.modulea.print - printing for module a
...
com.example.moduleb - moduleb specific code that doesn't have any real need of a different package
com.example.moduleb.dao - data access for module b
com.example.moduleb.print - printing for module b
In this fashion, code is clearer by package.
In the other style, of grouping by pure functionality, the package size tends to be quite large. If your project contains 15 modules, and each module has one or more elements per package, that's at least 15 classes per package. I much prefer clearly separated packages than packages that simply group things because "oh here are some printing utilities that are used for every module but only one module actually uses one of them from this package" - it just gets confusing.

java package name convention failure

I'm just coming up the learning curve for Java SE & have no problem with the usual Java convention for package names, e.g. com.example.library_name_here.package_name_here
Except.
I've been noticing a failure to abide by this in some fairly well-known packages.
JLine: jline.*
JACOB: com.jacob.* (there is no jacob.com)
JNA: com.sun.jna.* (disclaimer on the site says NOTE: Sun is not sponsoring this project, even though the package name (com.sun.jna) might imply otherwise.)
So I'm wondering, are there instances where the usual reverse-domain-name convention breaks down, and there are good ways to get around it? The only cases I can think of revolve around domain-name ownership issues (e.g. you change the project hosting/domain name, or there's already a well-known package that has "squatter's rights" to your domain, or your ownership of the domain runs out & someone else snaps it up).
edit: if I use my company's domain name, and we are bought out or have a spin-off, what should we do with package names? Keep them the same or rename? (I suppose renaming is bad, because compiled classes that refer to the old package would then break.)

It's a naming convention. There's no real requirement or even expectation that the package name maps to a domain name.

The general idea is that two organizations would not own the same domain, so using the domain name as part of the package ensures that there are no namespace clashes. This is only a recommendation however.
There is a good reason for someone to have packages in the sun namespace. If they are providing an implementation of a public API, it's often necessary to implement the classes in the API's namespace.

If you're making your way up the Java learning curve, I would worry more about making your packaging structure clear so you can easily find the class you are looking for.

Packages are used to avoid ambiguity and collisions between components built by various entities. As long as you follow the convention, and nobody illicitly uses your slice of the package namespace pie, you shouldn't need to worry about what others have used.

The only thing that matters (IMHO) is that the parts of the package name are “sorted” by importance, i.e. that you don’t end up with gui.myprog, util.myprog, main.myprog but with myprog.gui, myprog.util, and myprog.main. Whether the package name really begins with a top-level domain followed by a domain name is of no concern to me.

You can't use language keywords as parts of a package name. That's another case where the domain name convention cannot be applied: tough luck for LONG Building Technologies.
But then, the convention is just that, a convention, and pretty much the only reason why it exists is that it minimizes the chance of different projects accidentally choosing the same package name. If you can't follow it, it's not really a big problem.
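A quick way to check the keyword restriction is the JDK's javax.lang.model.SourceVersion utility. In this sketch the package names are hypothetical (the company's real domain is not given here), and PackageNameCheck is an invented class name. The trailing underscore follows the workaround suggested in Oracle's Java tutorial for name components that collide with a keyword.

```java
import javax.lang.model.SourceVersion;

// Hypothetical demo class for illustration.
class PackageNameCheck {
    // True if every dot-separated part is a legal identifier and not a keyword.
    static boolean isLegalPackageName(String name) {
        return SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        System.out.println(SourceVersion.isKeyword("long"));           // true
        System.out.println(isLegalPackageName("com.long.building"));   // false
        System.out.println(isLegalPackageName("com.long_.building"));  // true
    }
}
```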
