Do you follow any design guidelines in java packaging?
is proper packaging is part of the design skill? are there any document about it?
Edit : How packages has to depend on each other?, is cyclic packages unavoidable?, not about jar or war files.
My approach that I try to follow normally looks like this:
Have packages of reasonable size. Less then 3 classes is strange. Less then 10 is good. More then 30 is not acceptable. I'm normally not very strict about this.
Don't have dependency cycles between packages. This one is tough since many developers have a hard time figuring out any way to keep the dependencies cycle free. BUT doing so teases out a lot of hidden structure in the code. It becomes easier to think about the structure of the code and easier to evolve it.
Define layer and modules and how they are represented in the code. Often I end up with something like <domain>.<application>.<module>.<layer>.<arbitrary substructure as needed> as the template for package names
No cycles between layers; no cycles between modules.
In order to avoid cycles one has to have checks. Many tools do that (JDepend, Sonar ...). Unfortunatly they don't help much with finding ways to fix cycles. That's why I started to work on Degraph which should help with that by visualizing dependencies between classes, packages, modules and layer.
Packaging is normally about release management, and the general guidelines are:
consistency: when you are releasing into integration, pre-production or production environment several deliveries, you want them organized (or "packaged") exactly the same way
small number of files: when you have to copy a set of files from one environment to another, you want to copy as many as possible, if their number is reasonable (10-20 max per component to deliver), you can just copy them (even if those files are important in size)
So you want to define a common structure for each delivery like:
aDelivery/
lib // all jar, ear, war, ...
bin // all scripts used to launch your application: sh, bat, ant files, ...
config // all properties files, config files
src // all sources zipped into jars
docs // javadoc zipped
...
Plus, all those common directory structures should be stored into one common repository (a VCS, or a maven repo, or...), in order to be queried, without having to rebuilt them every time you need them (you do not need that if you have only one or two delivery components, but when you have 40 to 60 of them... a full rebuilt is out of the question).
You can find a lot of information here:
What strategy do you use for package naming in Java projects and why?
The problem with packaging in Java is that it has very little relation to what you would like to do. For example, I like following the Eclipse convention of having packages marked internal, but then I can't define their classes with a "package" protection level.
Related
Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done for example via an Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
It may not be a good approach automate, as application can use reflection to initialise objects or one JAR is dependent on another JAR.
Only way that I can think of is to remove each JARs one by one and check if application runs as expected. Then again in this approach all modules of the application has to be tested, since one module can work without particular dependency and other may not.
Better solution is to take care while developing. The application developer must be careful in adding a dependency and removing unwanted dependency after his/her piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List of all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done 1 and 2 part so I can help you.
1) Find out all the classes that are loaded. You need 100 % code coverage (I am not talking about tests, but production). So run all possible scenarios, so all the classes your app needs will be loaded and logged.
To log loaded classes try several approaches. Reflection, –verbose:class flag, also you can learn about java agent. It allows to modify methods during runtime. This is an example of some java agent code or another java agent example
2) To find all the classes available in jar, you can write a program. You need to know all places where application jars are placed. Loop throw these jars (You can use ZipFile), loop through ZipFileEntry entries, and collect all classes.
3) After that write a script or program that reassembles your application. For example, now you can create a new jar file for each library and put there only needed classes.
Also you may use a tool (again, you are a programmer, so write a program), which checks code for classes dependence. You do not want to remove classes if they are used for compilation. When I was a student, I wrote code alanyzer, which builds an oriented graph for classes dependencies.
As #Gokul Nath KP notes, I did this before. I manually change gradle and maven dependencies, removing one by one, and then full regression test. It took me a week (our application was small comparing to modern world enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!
I want to understand the packing methodology in real big projects.
Suppose we have a package com.abc.xyz, and for this, we really have a path like com/abc/xyz.
Is it possible to have multiple same package names in different directory structure like:
Directory path 1:
/home/user1/project/module1/src/java/com/abc/xyz
Directory path 2:
/home/user1/project/module2/src/java/com/abc/xyz
And finally when we create jar for the whole project, do we create jar with respect to com directory?
When some application uses import com.abc.xyz, how does it know which directory path's package it is referring to?
And finally, is there any good book/resource which gives guidelines about packaging, how to divide project into modules, package names etc.
One more thing, does a project have common package base name like in above case:
com.abc.xyz (e.g., org.apache.hadoop ).
Thanks,
Vipin
Packages created in different source directories are the same package, as far as the classloader is concerned. It also doesn't matter if the class files are in the same jar or different jars. The JVM does not discriminate based on where the source code came from.
(Of course if you have two jars loaded by different classloaders those are going to be treated differently.)
One case where you frequently have different source trees with the same package is when you have tests in a different directory (using the usual Maven convention where the code is under src/main/java and the tests are in src/test/java) but with the same package as the code that they exercise. These tests are able to exercise protected and package-private parts of the code under test, because they're in the same package as that code.
The path of directories inside the jar should start at the root of the package. (The topmost directory should be /, then one called com or org or whatever, etc.) Packages do form a tree-like structure, and when you put your code in a filesystem you end up having a hierarchy of packages, but the language itself doesn't recognize a concept of "subpackage" (except that packages that start with java are special and get special treatment by the classloader).
Organizing code into packages is done differently by different people. Some people like to organize their code by layer (putting all controllers in one package, all services in another package, all daos in still another package), some like to organize their code by feature.
Package-by-layer is the conventional way of organizing code, it seems to be the preferred practice in the Java community. One consequence of this is that when code implements a feature as a vertical slice at right angles to the package structure (as it may require a new controller endpoint, maybe a new service method, etc.), so closely-related bits of code for the same feature end up scattered across different directories. The Java Practices website makes an interesting case for package-by-feature:
Package By Feature Package-by-feature uses packages to reflect
the feature set. It tries to place all items related to a single
feature (and only that feature) into a single directory/package. This
results in packages with high cohesion and high modularity, and with
minimal coupling between packages. Items that work closely together
are placed next to each other. They aren't spread out all over the
application. It's also interesting to note that, in some cases,
deleting a feature can reduce to a single operation - deleting a
directory. (Deletion operations might be thought of as a good test for
maximum modularity: an item has maximum modularity only if it can be
deleted in a single operation.)
Here's an SO question asking about package by feature or layer.
Yes, you could make duplicate packages in separate directories, but I can't think of a good reason to do it. If the classes within the package have the same names you can certainly get namespace collisions. I am not sure what "module" means in this context but I'd recommend
com.abc.module1.xyz
com.abc.module2.xyz
instead. Those would be distinct packages to the classloader. You can still keep your /home/user1/project/module1/ directory structure up front, that doesn't matter.
From 2 modules you will have two seperate jar files: module1.jar and module2.jar. Both will be loaded into ClassLoader when application starts.
When some application uses import com.abc.xyz, how does it know which directory path's package it is referring to?
Classloader will handle that. http://www.javaworld.com/article/2077260/learn-java/the-basics-of-java-class-loaders.html
If you trying to develop multi module application i recommend you to check Maven tool:
http://maven.apache.org/
Why maven? What are the benefits?
For guidance for package organization you can just google 'java packages' phrase.
http://www.tutorialspoint.com/java/java_packages.htm
https://www.facebook.com/Niranthara-Jaya-JavaSocial-Media-Apps-Software-Project-Management-244119296136021/
This page is for people who wish to know how to work with real world Java projects. Send a message to this page and check out the articles.
We have a multi-module POM, which also serves as a parent POM for all sub-modules involved. Call it MultiModulePOM. We have about 70 modules, say numbered Module1 to Module70.
Now: The first 30 of these modules require a set of JAR files at compile-time only. That is - scope=provided. Since we're talking about a set of JAR files, it is quite tedious to keep those 30 modules in sync and in general, I am not a huge fan of copying definitions around.
So, I fell into the pitfall of dependency grouping. Seemed like a good idea, however it doesn't work for provided dependencies. In other words: if I group the dependent JARs in a module called ExtDependencies, and make Module1 depend on ExtDependencies, the JARs referred-to by ExtDependencies won't be transitively-added to Module1, because their scope is provided.
(If the last paragraph is not true, please let me know as it could really get me out of a jam)
The only other option that I could see was to create a parent POM called (for example) IntermediaryPOM. IntermediaryPOM extends MultiModulePOM and enlists the set of dependent JAR files with scope=provided. Modules Module1-Module30 then extend IntermediaryPOM.
That seemed to do the trick but I have three problems with it:
It adds another layer of POM that I'm not sure is really needed.
Later, during distribution time, I find myself having to install/deploy the intermediary POM's as well.
Consider the general case: the intermediary POM may have other siblings used for other sets of JARs (for modules 31-50). Therefore, this solution doesn't seem to scale well.
So my question is - according to your experience, what is the best way to approach this? any known best practices for such a use case?
I'm afraid there is no easy solution here.
You're right saying that if you declare common dependencies in ExtDependencies as provided they won't be added to the classpath of any other module that is dependent on ExtDependencies. That's how provided works.
But you could declare these common dependencies without scope (e.g. with default compile scope) and add provided dependency on ExtDependencies. In this case all of the ExtDependencies dependencies will be added to classpath. God, that's a lot of "dependencies" :)
You've also mentioned other possible option -- introduce another level of abstraction (which, as you might know, is a way to solve almost any problem). But such multi-level hierarchy is less elegant and more difficult to maintain (I have it in our projects, so I've been there).
In general, I haven't come across this problem in such a scale but if I were to solve it I'd go with the first option taking into account scoping suggestion.
Whats the best practice for setting up package structures in a Java Web Application?
How would you setup your src, unit test code, etc?
You could follow maven's standard project layout. You don't have to actually use maven, but it would make the transition easier in the future (if necessary). Plus, other developers will be used to seeing that layout, since many open source projects are layed out this way,
There are a few existing resources you might check:
Properly Package Your Java Classes
Spring 2.5 Architecture
Java Tutorial - Naming a Package
SUN Naming Conventions
For what it's worth, my own personal guidelines that I tend to use are as follows:
Start with reverse domain, e.g. "com.mycompany".
Use product name, e.g. "myproduct". In some cases I tend to have common packages that do not belong to a particular product. These would end up categorized according to the functionality of these common classes, e.g. "io", "util", "ui", etc.
After this it becomes more free-form. Usually I group according to project, area of functionality, deployment, etc. For example I might have "project1", "project2", "ui", "client", etc.
A couple of other points:
It's quite common in projects I've worked on for package names to flow from the design documentation. Usually products are separated into areas of functionality or purpose already.
Don't stress too much about pushing common functionality into higher packages right away. Wait for there to be a need across projects, products, etc., and then refactor.
Watch inter-package dependencies. They're not all bad, but it can signify tight coupling between what might be separate units. There are tools that can help you keep track of this.
I would suggest creating your package structure by feature, and not by the implementation layer. A good write up on this is Java practices: Package by feature, not layer
The way I usually organise is
- src
- main
- java
- groovy
- resources
- test
- java
- groovy
- lib
- build
- test
- reports
- classes
- doc
I usually like to have the following:
bin (Binaries)
doc (Documents)
inf (Information)
lib (Libraries)
res (Resources)
src (Source)
tst (Test)
These may be considered unconventional, but I find it to be a very nice way to organize things.
The way i usually have my hierarchy of folder-
Project Name
src
bin
tests
libs
docs
One another way is to separate out the APIs, services, and entities into different packages.
I have a large application (~50 modules) using a structure similar to the following:
Application
Communication modules
Color communication module
SSN communication module
etc. communication module
Router module
Service modules
Voting service module
Web interface submodule for voting
Vote collector submodule for voting
etc. for voting
Quiz service module
etc. module
I would like to import the application to Maven and Subversion. After some research I found that two practical approaches exists for this.
One is using a tree structure just as the previous one. The drawback of this structure is that you need a ton of tweaking/hacks to get the multi-module reporting work well with Maven. Another downside is that in Subversion the standard trunk/tags/branches approach add even more complexity to the repository.
The other approach uses a flat structure, where there are only one parent project and all the modules, submodules and parts-of-the-submodules are a direct child of the parent project. This approach works well for reporting and is easier in Subversion, however I feel I lose a bit of the structure this way.
Which way would you choose in the long term and why?
We have a largish application (160+ OSGi bundles where each bundle is a Maven module) and the lesson we learned, and continue to learn, is that flat is better. The problem with encoding semantics in your hierarchy is that you lose flexibility. A module that is 100% say "communication" today may be partly "service" tomorrow and then you'll need to be moving things around in your repository and that will break all sorts of scripts, documentation, references, etc.
So I would recommend a flat structure and to encode the semantics in another place (say for example an IDE workspace or documentation).
I've answered a question about version control layout in some detail with examples at another question, it may be relevant to your situation.
I think you're better off flattening your directory structure. Perhaps you want to come up with a naming convention for the directories such that they sort nicely when viewing all of the projects, but ultimately I don't think all of that extra hierarchy is necessary.
Assuming you're using Eclipse as your IDE all of the projects are going to end up in a flat list once you import them anyway so you don't really gain anything from the additional sub directories. That in addition to the fact that the configuration is so much simpler without all the extra hierarchy makes the choice pretty clear in my mind.
You might also want to consider combining some of the modules. I know nothing about your app or domain, but it seems like a lot of those leaf level modules might be better suited as just packages or sets of packages inside another top level module. I'm all for keeping jars cohesive, but it can be taken too far sometimes.