Java: Restricting visibility in large projects

How can I manage (=restrict) visibility in large Java projects?
More specifically:
Given are the features "Big-A" and "Big-B", some sub-features "Little-A-1" and "Little-A-2", and some sub-sub-features "Tiny-A-1-I" and "Tiny-A-1-II". The "Little-A-" features are sub-features of "Big-A", and the "Tiny-A-1-" features are sub-features of "Little-A-1".
There are classes and methods in "Little-A-1" which need to be visible for its enclosing feature "Big-A" and all its (sub-)sub-features, but not for "Big-B". Other classes and methods should only be visible for "Little-A-1" and its sub-features, but not for "Big-A" and not for "Little-A-2". And of course "Tiny-A-1-I" has classes and methods, too, which should only be visible within "Tiny-A-1-I" but not for "Tiny-A-1-II", or only within "Little-A-1" and its sub-features, but not for "Little-A-2" and so on.
In short: I have a large hierarchy of features and need to restrict visibility to parts of that hierarchy. How can this be done in Java, given that the built-in visibilities are not powerful enough for that?
Is there any Java built-in feature that can be used for that? Or any external project? (Or an Eclipse plugin, at least?) I looked at annotation processors, but it seems they are not powerful enough either, as they have no access to the actual code, only to its static structure. Given that Java is used for many big projects, I guess there has to be some solution to that problem. I am probably not the only one having that problem.
The ideal solution would allow me to apply it stepwise, feature by feature, and not require me to adapt all the code at once, as that is not possible for me (lack of time and resources). And integration into Eclipse (showing errors when an invisible element is accessed) would be nice, too. :-)
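
The visibility rule described in the question can be modelled as a prefix check on feature paths. The following is only a minimal sketch of that rule, not an existing tool or a compiler-enforced mechanism; the class `FeatureVisibility` and the method `isVisible` are hypothetical names introduced for illustration, and features are assumed to be identified by their path from the root of the hierarchy.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: a feature is identified by its path from the root of the hierarchy. */
final class FeatureVisibility {

    private FeatureVisibility() {}

    /**
     * Returns true if code located in feature {@code accessor} may use an element whose
     * visibility is restricted to feature {@code scope} and its sub-features,
     * i.e. if {@code scope} is a prefix of {@code accessor}.
     */
    static boolean isVisible(List<String> scope, List<String> accessor) {
        return accessor.size() >= scope.size()
                && accessor.subList(0, scope.size()).equals(scope);
    }

    public static void main(String[] args) {
        List<String> bigA = Arrays.asList("Big-A");
        List<String> bigB = Arrays.asList("Big-B");
        List<String> littleA1 = Arrays.asList("Big-A", "Little-A-1");
        List<String> littleA2 = Arrays.asList("Big-A", "Little-A-2");
        List<String> tinyA1I = Arrays.asList("Big-A", "Little-A-1", "Tiny-A-1-I");

        System.out.println(isVisible(bigA, littleA2));     // true: inside Big-A
        System.out.println(isVisible(bigA, bigB));         // false: Big-B is outside Big-A
        System.out.println(isVisible(littleA1, tinyA1I));  // true: sub-feature of Little-A-1
        System.out.println(isVisible(littleA1, littleA2)); // false: sibling feature
        System.out.println(isVisible(littleA1, bigA));     // false: enclosing feature
    }
}
```

A check like this would still need to be fed with the feature of each accessing and accessed element, for example derived from package names, which is an assumption about how the project is laid out.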

Related

Find out used classes and methods from Java source code

For Java source files, I would like to find out:
Which classes use which other classes (fully qualified names)?
Which methods call which other methods (fully qualified names)?
What would be a reasonable way to achieve that?
EDIT:
To clarify: I want a list of source code files as input. The output should be (as specified above) which class uses which other class and which method calls which other method. I do not want to inspect other loaded classes at runtime, like when using reflection.

You need a static analysis tool, such as STAN in standalone mode:
The standalone application is targeted to architects and project managers who are typically not using the IDE.
Or JArchitect (also available from the command line):
JArchitect is a powerful tool for static code analysis. It can provide a lot of insight into complex code bases. Using custom code queries you are able to build your own rule sets in a very comfortable way.
In the Class Browser right-click menu, JArchitect proposes to explore the graph of dependencies between members (methods + fields) of a type.
Another option is SourceTrail:
The graph visualization provides a quick overview of any class, method, field, etc., of interest and all its relations. The graph is fully interactive. Use it to move through the codebase by focusing on related nodes and edges.
(source: sourcetrail.com)

Unfortunately, reflection doesn't give you all the information you need to do this.
I've done it with ASM (https://asm.ow2.io/).
It provides the ability to walk the byte code of all of your classes using the visitor pattern, including the actual method implementations, from which you can extract the references to other classes.
I'm sorry that I cannot provide the implementation, because it's proprietary.
Note that this works from your .jar files, not your sources. If you really need to work from sources, then have a look at https://github.com/javaparser . Really, though, it's better to use the byte code, since the Java language changes frequently, while the byte code specification changes far less.

I am not sure how to get a listing, but for identifying refactoring opportunities, you might try IntelliJ IDEA. It grays out the signature line of any method that is not accessed in the project. It also detects code segments that are repeated elsewhere in the project, so you can extract common code.

Prevent Internals leaking into API

I'm looking for different ways to prevent internals leaking into an API. This is a huge problem because once these internals leak into the API, you can run either into unexpected incompatibility issues or into frozen internals.
One of the simplest ways to do so is just make use of different Maven modules; one module with API and one module with implementation. This way it is impossible to expose the implementation from the API.
Unfortunately, not everyone agrees this is the best approach. But are there other alternatives, e.g. using checkstyle or other 'architecture checking' tools?
PS: Java 9 is not usable for us, since we are about to upgrade to Java 8 and this will be the lowest supported version for quite some time to come.

Following your checkstyle idea, it should be possible to set up rules which examine import statements in source files.
Checkstyle has built-in support for that, specifically the IllegalImport and ImportControl rules.
This of course works best if public and internal classes can be easily separated by package names.
The idea for IllegalImport would be that you configure a TreeWalker in checkstyle which only looks at your API-sources, and which excludes imports from internal packages.
With the ImportControl rule on the other hand you can define very detailed access rules for the whole application/module in a separate XML file.
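
To illustrate the idea behind such an import rule (this is not Checkstyle's implementation or configuration, just a hand-written sketch of the same kind of check), the following scans a source text for imports from a forbidden package. The class `IllegalImportCheck`, the method `findIllegalImports`, and the package names `com.acme.stuff.api` and `com.acme.stuff.impl` are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: reports imports from a forbidden package prefix. */
final class IllegalImportCheck {

    // Matches "import a.b.C;", "import static a.b.C.d;" and "import a.b.*;".
    private static final Pattern IMPORT =
            Pattern.compile("^\\s*import\\s+(?:static\\s+)?([\\w.]+(?:\\.\\*)?)\\s*;");

    private IllegalImportCheck() {}

    /** Returns every imported name in {@code source} that starts with {@code forbiddenPrefix}. */
    static List<String> findIllegalImports(String source, String forbiddenPrefix) {
        List<String> hits = new ArrayList<>();
        for (String line : source.split("\\R")) {
            Matcher m = IMPORT.matcher(line);
            if (m.find() && m.group(1).startsWith(forbiddenPrefix)) {
                hits.add(m.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String apiSource =
                "package com.acme.stuff.api;\n"
              + "import java.util.List;\n"
              + "import com.acme.stuff.impl.DefaultWidget;\n"
              + "public interface Widget {}\n";
        // prints [com.acme.stuff.impl.DefaultWidget]
        System.out.println(findIllegalImports(apiSource, "com.acme.stuff.impl."));
    }
}
```

A line-based scan like this misses fully qualified names used without an import, which is one reason to prefer an established tool for real projects.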

It is standard in Java to define an API using interfaces and implement them using classes. That way you can change the "internals" however you want and nothing changes for the user(s) of the API.
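
A minimal sketch of that pattern, with hypothetical names (`Greeter`, `DefaultGreeter`, `Greeters`) invented for illustration. In a real project the interface and the factory would be public types in an API package and the implementation would be a non-public class; here everything sits in one file so that it compiles on its own.

```java
/** The contract; this is all that users of the API would compile against. */
interface Greeter {
    String greet(String name);
}

/** Implementation detail; free to change as long as the contract above still holds. */
final class DefaultGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name + "!";
    }
}

/** Factory that hands out the interface type only, never the concrete class. */
final class Greeters {

    private Greeters() {}

    static Greeter create() {
        return new DefaultGreeter();
    }

    public static void main(String[] args) {
        Greeter greeter = Greeters.create();
        System.out.println(greeter.greet("World")); // prints Hello, World!
    }
}
```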

One alternative is to have one module (Jar file) for API and implementation (but then again, is it an API or just any kind of library?). Inside, one separates classes and interfaces by using packages, e.g. com.acme.stuff.api and com.acme.stuff.impl. It is important to make classes inside the latter package package-private (top-level classes cannot be declared protected), and to keep their members protected or package-private where possible.
Not only does the package name show the consuming developer "hey, this is the implementation", it is also not possible to use anything inside (let's leave reflection aside at this point for the sake of simplicity).
But again: This is against the idea of an API, because usually the implementation can be changed. With this approach one cannot separate API from implementation, because both are inside the same module.
If it is only about hiding internals of a library, then this is one (not the one) feasible approach.
And just in case you meant a library instead of an API, which only exposes its "frontend" (by using interfaces or abstract classes and such), use different package names, e.g. com.acme.stuff and com.acme.stuff.internal. The same visibility rules apply of course.
Also: This way one does not need Checkstyle and other burdens.

Here is a good start: http://wiki.netbeans.org/API_Design
Key point: Do not expose more than you want. Obviously, the less of the implementation is expressed in the API, the more flexibility one can have in future. There are some tricks that one can use to hide the implementation, but still deliver the desired functionality.
I think you don't need any checkstyle or anything like that, just a good old solid design and architecture should be enough. Polymorphism is all you need here.
"One of the simplest ways to do so is just make use of different Maven modules; one module with API and one module with implementation. This way it is impossible to expose the implementation from the API."
Yes, I totally agree, hide as much as possible, separate your interface in a standalone project.

Naming convention for homonymous classes in different libraries

In any project with a relatively big number of dependencies there are always a lot of commonly named classes in different libraries. For example, Configuration is very widely used.
It slows down the programmer, who has to carefully pick the right class from the list. It is also very irritating if you have to use different configurations in one class, so they have to be prefixed with the full package name.
I'm writing a library which also needs a Configuration class. Should I use this name? Or is it better to name it {Libname}Configuration? Is there any common way to avoid such problems?

I think you should name your class in the way that makes its usage most clear, and you shouldn't care too much if other classes with the same name exist. As you know, the usage of packages reduces the risk of naming collisions.
For any project I think it's good that the whole team has a convention for basic naming and formatting. It's important to use a consistent naming convention so that when new people work on it, they pick it up faster. I think that conventions also help increase productivity since it's easier to remember names.
I think it's good to spend some time thinking about classes, not only in terms of algorithms but also in terms of what business role they fill. Thinking about why a class is necessary and what it brings to the project can make you more aware of the way your method/class/variable works within the application workflow.
That being said, I think that maybe your IDE has some option to hide some of the classes it shows. I'm using IntelliJ and it has a feature for this situation, even though it's a bit hidden.

I think it is usually not a very good idea to start a class name with the library name, simply because in the long run that will make it more difficult to remember, and because it diminishes readability.
There are ways to set up your IDE (depending on which IDE you use) so that autocomplete shows the most used classes first. You can also get to a class quickly by first typing the name, then the library, when using autocomplete. These all depend on your IDE. But generally it seems like a bad practice to start a class name with the name of the library.

Tool to determine transitive Java imports?

Is there some sort of tool that you can point at a set of Java classes, and it produces output showing the transitive imports of each class?
I understand that imports are not "transitive" from the point of view of the language itself - i.e. if com.acme.X imports com.acme.Y, and com.acme.Y imports com.acme.Z, that does not mean that you can refer to com.acme.Z within com.acme.X. But that's not what I mean:
Rather, I mean that com.acme.X nonetheless depends upon com.acme.Z (at least under the current implementations of X and Y), and I want to know that fact. In fact I want to know it for a large number of classes, and so I'm hoping that there's a tool to determine it automatically.
Either a standalone tool or an Eclipse plugin or feature would be great.
Thanks in advance.
EDIT to hopefully show what I want this for:
I have a huge monolithic jar that contains many features that are (essentially) completely unrelated. I'd like to break it apart into several smaller, more manageable, and more self-coherent jars.
Unfortunately, I can't do it simply by breaking it up based on packages, because many of the packages themselves are not self-coherent either. That is, for example, there's a "com.acme.utils" package. Two things in that package probably have nothing in common except for the fact that they're both, in some sense, "utilities". One may be a utility for some particular business function, another may be TCP/IP utilities, another may be a set of string utilities, another may be some completely unrelated business function.
And there are a bunch of packages like this. So when you look at the transitivity of imports from the point of view of packages, they snowball without limit, and so more or less everything in the monolithic jar depends on everything else in the monolithic jar.
So I'd like to start by considering transitivity of imports from the class point of view, rather than the package point of view. That way I should be able to more easily determine what classes need to be reorganized from what existing packages into new, more coherent packages, and then after that I can break the monolithic jar apart by packages / sets of packages.
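
Once the direct class-to-class dependencies are known (from whatever tool extracts them), the transitive set described above is a plain graph traversal. This is a minimal sketch under that assumption; the class `TransitiveDependencies`, its method, and the example class names are hypothetical, and the direct-dependency map is supplied by hand here.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: computes everything a class depends on, directly or indirectly. */
final class TransitiveDependencies {

    private TransitiveDependencies() {}

    /**
     * Breadth-first traversal over the direct-dependency map.
     * If the graph contains a cycle through {@code start}, the result includes {@code start} itself.
     */
    static Set<String> transitiveDependencies(Map<String, Set<String>> direct, String start) {
        Set<String> none = Collections.emptySet();
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(direct.getOrDefault(start, none));
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (seen.add(current)) { // only expand classes not visited before
                queue.addAll(direct.getOrDefault(current, none));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> direct = new HashMap<>();
        direct.put("com.acme.X", Collections.singleton("com.acme.Y"));
        direct.put("com.acme.Y", Collections.singleton("com.acme.Z"));
        // prints [com.acme.Y, com.acme.Z]
        System.out.println(transitiveDependencies(direct, "com.acme.X"));
    }
}
```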

We're using Sonar for software metrics: http://www.sonarsource.org/

What is the benefit in dynamically generating Java bean classes from XML?

I have written a lot of Java bean classes using my IDE. Another person suggests a different approach. He suggests that I put an XML file with the bean definitions in the project and then use either JAXB or XSLT to generate the classes at build time. Though it's a novel and interesting approach, I do not see any major benefit in it.
I see only one benefit in this suggested approach: the Java bean classes need not be maintained in configuration control. Any bean change requires only an update to the XML file.
Are there any major benefits in dynamically generating Java classes? Is there any other reason why this approach is taken?

I agree with @Akhilss. My experiences have been in large scale Java EE projects where code generation is common.
It all depends on your project. If you are coding only a few beans and only need basic functionality, then I don't see the need to start with XML (which is often overused anyway), especially if you don't actually need the XML as well.
However, if you are building a system which needs the XML, an example being a SOAP web service with WSDL and schema, then generation is a good idea because it saves you from manually keeping schemas and beans in sync. It also provides factory classes and other support code.
As a counter argument, with EJB3 and similar standards, it's now often easier to write the beans and generate the messy XML stuff on the fly, i.e. let the server do the grunt work.
Another reason to consider code generation is if you require more complex functionality in your beans because they represent data structures. A few years ago I trialled the Apache Tuscany project for generating SDO beans from XML. The nice thing about that was that I could generate functionality like property change notifications, so when I modified any of a bean's properties (including collections), other parts of the program could be notified automatically. Generated functionality like that can save you a lot of time and money if you need it.
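
For comparison, this is roughly what a hand-written bean with property change notification looks like when built on the standard `java.beans.PropertyChangeSupport` class. It is only a sketch of the kind of boilerplate a generator could produce, not the code that Tuscany SDO actually generates; the bean `CustomerBean` and its `name` property are hypothetical.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Hypothetical bean that notifies listeners whenever its name changes. */
final class CustomerBean {

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    String getName() {
        return name;
    }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // No event is fired if old and new value are equal and non-null.
        changes.firePropertyChange("name", oldName, newName);
    }

    public static void main(String[] args) {
        CustomerBean bean = new CustomerBean();
        bean.addPropertyChangeListener(evt -> System.out.println(
                evt.getPropertyName() + ": " + evt.getOldValue() + " -> " + evt.getNewValue()));
        bean.setName("Alice"); // prints name: null -> Alice
    }
}
```

Writing this by hand for every property of every bean is repetitive, which is where generation pays off.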
Ultimately, I'd suggest adhering to the KISS principle. So don't add what you don't need. Generated code from XML is useful if it helps you in the long run. But like any technology, be sure you are adding it for the right reasons.

I have used Jibx and its generator in my project. My experience has been mixed.
The usual case for using JAXB's (XJC) generator is referred to in http://static.springsource.org/spring-ws/site/reference/html/why-contract-first.html
Conversion to and from XML made it possible to store beans in the DB and retrieve them for future use, as well as to use them as test case input for functional tests.
Using any kind of generator (JAXB, JiBX, XMLBeans, custom) might make sense for large projects. It allows for standardization of data types (like BigDecimal for financial amounts, or ArrayList for all lists) and for forcing interfaces (like Serializable or Cloneable). This enforces good practices and reduces the need for reviews of generated files.
It allows for injection of code through XSLT or post-processing of the generated Java files. An example is to inject rounding code to a specific decimal size (2, 6, 9) with a specific policy (UP, DOWN, NEAR) within the setter method for each field of type financialAmount. Forcing such behavior does reduce the number of bugs (for incorrect financial values, which companies are liable for).
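
A setter with injected rounding could look like the following. This is a hand-written sketch of what such generated code might resemble, not the output of any particular generator; the bean `InvoiceBean`, the scale of 2, and the choice of `RoundingMode.HALF_UP` are assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical bean whose setter normalises a financial amount to a fixed scale. */
final class InvoiceBean {

    // Assumed policy: two decimal places, ties rounded away from zero.
    private static final int SCALE = 2;
    private static final RoundingMode POLICY = RoundingMode.HALF_UP;

    private BigDecimal amount;

    BigDecimal getAmount() {
        return amount;
    }

    void setAmount(BigDecimal amount) {
        // The rounding step is what a generator would inject into every such setter.
        this.amount = (amount == null) ? null : amount.setScale(SCALE, POLICY);
    }

    public static void main(String[] args) {
        InvoiceBean invoice = new InvoiceBean();
        invoice.setAmount(new BigDecimal("10.005"));
        System.out.println(invoice.getAmount()); // prints 10.01
    }
}
```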
The disadvantages are:
Usually each Java class can only be a bean class. Any customization made will be overwritten, since (in my case) the generator is tied into the build process and the classes get generated with every build.
You cannot implement your custom interfaces on a bean class or add annotations for your own or third-party frameworks.
You cannot easily implement patterns like a factory method, since default constructors are usually generated. Refactoring is usually difficult since generators do not usually support it.
You may (not sure now, it was true a couple of years ago for JiBX) not be able to generate enums where they would be most applicable.
You may not be able to override the default datatype with your own regardless of the need, e.g. a copy-on-write list instead of an ArrayList for a variable shared across threads, or a custom implementation of a List which also implements the Observer pattern.
The benefits of a generator outweigh the costs for large (in persons and not code, think 150 developers in three locations) distributed projects. You can work around the disadvantages by defining your custom classes which contain the bean and implement behaviour, or by post-processing (adding additional code) with further metadata picked up from XSD annotations or another configuration file. Remember that support and maintenance of the generator become critical since the entire project depends on it. Use it with CAUTION.
For smaller sized projects I personally would write my own classes. For larger sized projects I personally would not use it in the middle tier mostly because of the lack of refactoring support. It can be used for simple beans meant to be bound to UI frameworks.
