Prevent Internals leaking into API


I'm looking for different ways to prevent internals leaking into an API. This is a huge problem because once internals leak into the API, you run into either unexpected incompatibility issues or frozen internals.
One of the simplest ways to do so is just make use of different Maven modules; one module with API and one module with implementation. This way it is impossible to expose the implementation from the API.
Unfortunately, not everyone agrees this is the best approach. Are there other alternatives, e.g. using Checkstyle or other 'architecture checking' tools?
PS: Java 9 is not an option for us: we are about to upgrade to Java 8, and that will be the lowest supported version for quite some time to come.

Following your checkstyle idea, it should be possible to set up rules which examine import statements in source files.
Checkstyle has built-in support for that, specifically the IllegalImport and ImportControl rules.
This of course works best if public and internal classes can be easily separated by package names.
The idea for IllegalImport is to configure a TreeWalker in Checkstyle that looks only at your API sources and forbids imports from internal packages.
With the ImportControl rule on the other hand you can define very detailed access rules for the whole application/module in a separate XML file.
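To make the idea concrete, here is a minimal Java sketch of the kind of check these rules automate. It is not Checkstyle's implementation or configuration format; the class name ImportGuard, the method findIllegalImports, and the package prefix passed to it are assumptions chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: flags imports of a forbidden package prefix in API source text. */
final class ImportGuard {

    private ImportGuard() {
    }

    /**
     * Returns every import line in the given source that refers to the forbidden package.
     * This is a plain text scan, so it misses fully qualified names used without an import.
     */
    static List<String> findIllegalImports(String source, String forbiddenPrefix) {
        List<String> violations = new ArrayList<>();
        for (String rawLine : source.split("\\R")) {
            String line = rawLine.trim();
            if (!line.startsWith("import ")) {
                continue;
            }
            // Strip the keyword, an optional "static", and the trailing semicolon.
            String target = line.substring("import ".length()).trim();
            if (target.startsWith("static ")) {
                target = target.substring("static ".length()).trim();
            }
            if (target.endsWith(";")) {
                target = target.substring(0, target.length() - 1).trim();
            }
            // Match the package itself or anything below it, but not "impl" vs "implicit".
            if (target.equals(forbiddenPrefix) || target.startsWith(forbiddenPrefix + ".")) {
                violations.add(line);
            }
        }
        return violations;
    }
}
```

A build could run such a scan over the API source tree only and fail when the returned list is not empty. A real tool works on the parsed syntax tree rather than on raw lines, which is one reason to prefer Checkstyle over a hand-rolled check.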

It is standard in Java to define an API using interfaces and implement them using classes. That way you can change the "internals" however you want and nothing changes for the user(s) of the API.
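As a minimal sketch of that idea, assuming hypothetical names (Greetings, Greeter, DefaultGreeter) that do not come from the question: callers see only the interface and a factory method, while the implementing class stays hidden. The types are nested in one class only to keep the example in a single file.

```java
/**
 * Illustrative only. Declared without "public" so the sketch compiles in any file;
 * a published API type would be public and each type would live in its own file.
 */
final class Greetings {

    private Greetings() {
    }

    /** The published contract: callers depend on this type only. */
    public interface Greeter {
        String greet(String name);
    }

    /** The only way to obtain a Greeter; the concrete class is never named by callers. */
    public static Greeter newGreeter() {
        return new DefaultGreeter();
    }

    /** Hidden implementation: free to be renamed, replaced, or split later. */
    private static final class DefaultGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name + "!";
        }
    }
}
```

Client code would call Greetings.newGreeter().greet("Ada") and never mention DefaultGreeter, so that class can change without breaking anyone.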

One alternative is to have one module (JAR file) for both API and implementation (but then again, is it an API or just any kind of library?). Inside it, one separates classes and interfaces by package, e.g. com.acme.stuff.api and com.acme.stuff.impl. It is important to make the classes in the latter package package-private (top-level classes cannot be declared protected).
Not only does the package name tell the consuming developer "hey, this is the implementation", it is also not possible to use anything inside (let's omit reflection at this point for the sake of simplicity).
But again: This is against the idea of an API, because usually the implementation can be changed. With this approach one cannot separate API from implementation, because both are inside the same module.
If it is only about hiding internals of a library, then this is one (not the one) feasible approach.
And just in case you meant a library instead of an API, which only exposes its "frontend" (by using interfaces or abstract classes and such), use different package names, e.g. com.acme.stuff and com.acme.stuff.internal. The same visibility rules apply of course.
Also: This way one does not need Checkstyle and other burdens.

Here is a good start : http://wiki.netbeans.org/API_Design
Key point: do not expose more than you want. Obviously, the less of the implementation is expressed in the API, the more flexibility one has in the future. There are some tricks one can use to hide the implementation but still deliver the desired functionality.
I think you don't need any checkstyle or anything like that, just a good old solid design and architecture should be enough. Polymorphism is all you need here.
"One of the simplest ways to do so is just make use of different Maven modules; one module with API and one module with implementation. This way it is impossible to expose the implementation from the API."
Yes, I totally agree, hide as much as possible, separate your interface in a standalone project.

Related

Is there a maven plugin that augments Java access control?

Is there a Maven plugin that makes mvn verify of an aggregating project fail when its submodules or their transitive dependencies depend on things they oughtn't?
I'd like to be able to restrict uses of public APIs to express policies like
Only classes or packages on a whitelist can invoke this public constructor/method.
This public setter that was produced by a code generator should not be called -- it should really have been package-private.
Motivation & Caveats
I realize that there are ways to work around these requirements using reflection and deserialization. My end goal is to allow system-architects & tech-leads to set a policy like
All uses of security-critical APIs should be in modules reviewed by security. Contact them if you need the whitelist expanded.
These deprecated APIs are banned in favor of new ones. There's a whitelist for grandfathered code which should shrink over time.
The system architect trusts application developers, but we want naive policy violations flagged with useful error messages, and we want developers who hack around the policy to be unable to plausibly deny that they did so.
Tricks like reflection and deserialization fall into that not-plausibly-deniable hacking.
This is kind of like some of the aims of Jigsaw, where a module (group of packages) can declare that its public interface is limited to just some packages, but jigsaw isn't widely available.
This question differs from "Make Java methods visible to only specific classes" because I'm not asking about ways to do this from within the Java language.

You can use Checkstyle to perform such checks; for your use case you could use its ImportControl check.
It seems that this does not catch fully qualified class names used without an import, based on the following answers:
Checkstyle rule to limit interactions between root packages (with ImportControl?)
How to prevent fully qualified names in Java code
As the second answer suggests, you could work around that by forbidding fully qualified names with another tool, PMD.
As for JSPs, these are usually compiled in the servlet container; nevertheless, there is a way to precompile them as well, using a Maven plugin.

Java SE - Clever way to implement "plug and play" for different library modules

I'm trying to do something clever. I am creating a weather application in which we can replace the weather API with another weather API without affecting the code base. So I started with a Maven project with multiple modules.
I have a Base module that contains the Interface class and the Base class. The Interface class contains the calls to the APIs (all calls are similar, if not exact) and the Base class contains the properties to the APIs (again, all properties are similar, if not exact).
I have a module for each of the two weather APIs we are testing with plans to create more modules for new weather APIs as we grow the application.
Finally, I have created a Core module (includes main) to implement the specific module class for the weather API I want to test.
Now, I know the simplest way to do this would be to use a switch statement and enumeration. But I want to know if there is a more clever way to do this. Maybe using a Pattern? Any suggestions?
Here is a picture of the structure I have just described:
Here is the UML representation:
This is a learning process for me. I want to discover how a real Java Guru would implement the appropriate module and class based on a specified configuration.
Thank you for your suggestions.

"I'm trying to do something clever. I am creating a weather application in which we can replace the weather API with another weather API without affecting the code base."
Without reading further, this first statement makes me think of a plugin architecture. But design decisions should not be rushed: the longer you can defer one, the more information you have and the better informed the decision will be. For now it is just an idea to keep in mind.
"I have a Base module that contains the Interface class and the Base class. The Interface class contains the calls to the APIs (all calls are similar, if not exact) and the Base class contains the properties to the APIs (again, all properties are similar, if not exact)."
When different modules share behaviour/state, it is a good idea to refactor them and produce base abstract classes and interfaces, so you are on the right track, but, if there are differences, those shouldn't be refactored into the base module. The reason behind that is simple, maintainability. If you start adding if clauses or switches to deal with these differences, you just introduced coupling between modules, and you'll be always having to make changes in the base module, whenever you add/modify other modules, and this is not desirable at all.
This is reflected by the Open/Closed principle from the SOLID principles, which states that a class should be open for extension but closed for modification.
So after you've refactored the common behaviour into the base modules, then each new API should extend the base module, as you did.
"Finally, I have created a Core module (includes main) to implement the specific module class for the weather API I want to test. Now, I know the simplest way to do this would be to use a switch statement and enumeration. But I want to know if there is a more clever way to do this. Maybe using a Pattern? Any suggestions?"
Indeed, using a switch makes it work, but it's not a clean design, for the same reason as before: adding, modifying, or removing modules would require modifying this module as well, and that code can potentially break.
One possible solution would be to delegate this responsibility to a new component and use a creational design pattern such as Abstract Factory, which provides an interface for instantiating components without specifying their concrete classes.
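A minimal sketch of such a component, under stated assumptions: WeatherService and WeatherServiceRegistry are hypothetical names, and what is shown is a simple registry of factories, a lighter relative of Abstract Factory rather than the full pattern. The point is only that the selecting code maps a configuration key to a factory instead of switching over an enumeration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the question's weather interface. */
interface WeatherService {
    String providerName();
}

/** Maps a configuration key to a factory, so no switch statement is needed. */
final class WeatherServiceRegistry {

    private final Map<String, Supplier<WeatherService>> factories = new LinkedHashMap<>();

    /** Each provider module registers a factory; the registry never names a concrete class. */
    void register(String key, Supplier<WeatherService> factory) {
        factories.put(key, factory);
    }

    /** Creates the service selected by configuration, or fails with a clear message. */
    WeatherService create(String key) {
        Supplier<WeatherService> factory = factories.get(key);
        if (factory == null) {
            throw new IllegalArgumentException(
                    "Unknown weather provider '" + key + "', known: " + factories.keySet());
        }
        return factory.get();
    }
}
```

Adding a new weather API then means one more register call in the new module's wiring code; the registry and the Core module's selection logic stay untouched.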
As for the architecture, so far, the plugin architecture still makes sense, but what if the different modules extend the base contract adding more features? One option is to use the Facade pattern to adapt the module calls and provide an output that implements an interface that clients expect.
With the details provided, this is the solution I'd suggest, but the scenario should be studied carefully and in greater detail before you can be sure these are the right tools for the job and commit to them.

In addition to Salvador Juan Martinez's answer...
To implement a plugin architecture, Java's JAR File Specification provides support for service provider interfaces (SPI) and defines how providers are looked up.
As of Java 1.6 you can use ServiceLoader to look up service providers. For Java 1.5 and earlier you must do it yourself or use a library, e.g. commons-discovery.
The usage is quite simple. In your case, put a META-INF/services/com.a2i.weatherbase.IWeather file in each plugin module.
In the Weather Forecast IO module the file should contain only one line
com.a2i.weatherforecastio.ForecastIO
The line must be the fully qualified name of an IWeather implementation class.
Do the same for the other module and you can load the implementations via ServiceLoader.
ServiceLoader<IWeather> weatherServicesLoader = ServiceLoader.load(IWeather.class);
Iterator<IWeather> weatherServices = weatherServicesLoader.iterator();
Now it depends on your runtime classpath how many services will be found. Try to add and remove module jar archives from the classpath and run your application.
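As a small, self-contained sketch of consuming the loader: the IWeather interface below is a hypothetical stand-in for com.a2i.weatherbase.IWeather (its currentSummary method is an assumption), and WeatherPlugins is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical stand-in for com.a2i.weatherbase.IWeather; its method is an assumption. */
interface IWeather {
    String currentSummary();
}

final class WeatherPlugins {

    private WeatherPlugins() {
    }

    /** Collects every provider declared in a META-INF/services file on the classpath. */
    static <T> List<T> loadAll(Class<T> serviceType) {
        List<T> providers = new ArrayList<>();
        // ServiceLoader is lazy: providers are instantiated as the iteration reaches them.
        for (T provider : ServiceLoader.load(serviceType)) {
            providers.add(provider);
        }
        return providers;
    }
}
```

With no plugin JAR on the classpath the returned list is simply empty, so the caller has to decide what that means (fail fast, or fall back to a default).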
EDIT
I wrote a blog post about a pluggable architecture with standard Java. See http://www.link-intersystems.com/blog/2016/01/02/a-plug-in-architecture-implemented-with-java/
Source code is also available at https://github.com/link-intersystems/blog/tree/master/java-plugin-architecture

One solution is to define a common interface with all the identified common operations. The extensions/plugins implement that interface and provide the implementation of the common operations.
You can use an abstract factory design pattern to hook up the exact implementation at runtime based on the input parameters.
Interfaces and abstract classes are always a good fit in such scenarios. Thanks.

Java - make a library and import optional

I have a library that I'm using in a Java application. It's important for certain functionality, but it's optional, meaning that if the JAR file is not there, the program continues without issue. I'd like to open source my program, but I cannot include this library, which is necessary to compile the source code because I have numerous import statements that use its API. I don't want to maintain two code sets. What is the best way to remove the physical JAR file from the open source release while keeping the code that supports it, so that other people can still compile it?

The typical approach is to define a wrapper API (i.e. interfaces), include those interfaces in the open-sourced code, and then provide configuration options where one can specify the names of the classes that implement those interfaces.
You will import API interfaces instead of importing classes directly into your open sourced code. This way, you are open sourcing the API but not the implementation of the parts that you do not want to open source or you cannot open source.
There are many examples, but take a look at JDBC API (interfaces) and JDBC drivers (implementation classes) for starters.
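A minimal sketch of this approach, assuming hypothetical names throughout (ReportExporter, PlainExporter, Exporters): the interface and a fallback ship with the open-sourced code, and the class name of the optional implementation comes from configuration.

```java
/** Hypothetical wrapper interface that ships with the open-sourced code. */
interface ReportExporter {
    String export(String text);
}

/** Fallback used when the optional implementation is not on the classpath. */
final class PlainExporter implements ReportExporter {
    @Override
    public String export(String text) {
        return text;
    }
}

final class Exporters {

    private Exporters() {
    }

    /** Instantiates the configured class, or returns the fallback if it cannot be loaded. */
    static ReportExporter load(String className) {
        try {
            Class<?> type = Class.forName(className);
            return type.asSubclass(ReportExporter.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            // Missing JAR, wrong type, or no usable constructor: carry on without the extra.
            return new PlainExporter();
        }
    }
}
```

Because the open-sourced code only ever imports ReportExporter, it compiles without the proprietary JAR; the implementation class that adapts the closed library lives in a separate module that is present only where the library is.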

I was pretty much typing the same thing as smallworld, with one addition. If this API were necessary, you could use a project build tool like Maven to handle the dependencies of your project. If someone checks it out from source control with the POM, they can download the dependencies themselves and you don't have to include them in the source repo.

There are probably a number of ways to fix this; here are a few I can think of:
If you have only a couple of methods you need to invoke in the 3rd party library, you could use reflection to invoke those methods. It creates really verbose code that is hard to read, though.
If you don't have too much of the API in the 3rd party library you use, you could also create a separate JAR file, containing just a non-functional shell of the classes in the library (just types with the same names and methods with the same signatures). You can then use this JAR to distribute and compile against. At run-time you'd replace it with the real JAR if available.
The most common way is probably to just create a wrapper API in a separate module/project for the code that is dependent on the 3rd party library, and possibly distribute a pre-built JAR. This might go against your wish to not maintain two code sets, but may prove to be the best and least painful solution in the long run.
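For the reflection option, a minimal sketch might look like the following. OptionalCalls and invokeStatic are hypothetical names, and the helper handles only public static methods without arguments; real library calls usually need parameters, which is where the verbosity mentioned above comes from.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;

final class OptionalCalls {

    private OptionalCalls() {
    }

    /**
     * Calls a public static no-argument method by name and returns its result.
     * Returns an empty Optional when the class or method is unavailable, the method
     * is not static, or the call fails. A void or null-returning method also yields empty.
     */
    static Optional<Object> invokeStatic(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className);
            Method method = type.getMethod(methodName);
            if (!Modifier.isStatic(method.getModifiers())) {
                return Optional.empty();
            }
            return Optional.ofNullable(method.invoke(null));
        } catch (ReflectiveOperationException | LinkageError e) {
            // The optional library (or this method) is not present: behave as if it did nothing.
            return Optional.empty();
        }
    }
}
```

Since nothing here imports the third-party types, the code compiles without the JAR, and at run time the call silently does nothing when the library is absent.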

Java package politics

I always hesitate when creating packages: I want to take advantage of package-limited access, but at the same time I want to have similar classes divided into packages.
The problem comes when you understand that packages are not hierarchical in Java:
At first, packages appear to be
hierarchical, but they are not.
source
Imagine I have an API defined with its classes at foo.bar, where only the classes the API client needs are public. Then I have another package, foo.bar.pojos, with some internal objects that the API needs. These classes need to be public so they can be accessed from foo.bar, but this means the API client can also access them by importing the package foo.bar.pojos.
What is the common package policy that should be followed?

I've seen two ways of doing this.
The first one consists in separating the public API and the internal classes into two different artifacts (JARs). The documentation is separated as well, so it's easy for the end user to tell what is internal and what is not. But having two JARs, two source trees, etc. sometimes makes things more complex.
The second one consists in delivering a single jar, but have a good documentation allowing to know what's internal and what's not. The textual documentation can explain how to use the API (and thus avoids talking about the internals). And the javadoc can specify that a class is for internal use and is thus subject to changes.

Yes, Java packages don't give you enough control over your dependencies. The classic way to deal with this is to put external APIs in one package and internal implementation classes in another, and rely on people's good sense to avoid creating dependencies on the latter.
With Maven and OSGi, you have an additional mechanism for managing dependencies between modules / bundles of packages. In the case of OSGi, you can explicitly declare some packages as not exported, and an OSGi-aware development environment will prevent people from creating harmful dependencies. Maven's module support is weaker, but at least it controls dependency cycles.
Finally, you could use custom PMD rules to enforce your project's modularization conventions ... in the same way that there are rules to discourage dependencies on Java's "com.sun.*" package tree.

It is a mess.
Using only what Java itself offers, you have to put everything in the same package. You end up with a single package (or a few) containing lots of classes, and no good way to group them for yourself (but at least that problem does not leak outside). Most people don't do that, though, and as a result the classpath you see as a developer building on top of these libraries is littered with public classes you should never need to see.
You might like OSGi, which has (and enforces) the concept of bundle-private packages. Those are not exported to the outside world.

I have what must surely be a fairly common documentation need...
I'm implementing a rather sizable Java library code base that has, among other things, various classes intended to be exposed to a caller/implementor at the appropriate level of abstraction. At the same time, the code base contains, of course, various internal classes, interfaces, and other abstractions that the user of the library doesn't need to know about in order to use the API.
Lots of other API libraries out there make the mistake of simply throwing everything into the Javadocs, and leaving it up to the user to figure out which objects and entities they actually need to deal with as a caller through some combination of guesswork, inference, and, if you're lucky, example code.
I don't want to be in that same position. I would like to have an "internal" set of Javadocs that expose the entire extent of the codebase, and an "external" set of Javadocs intended to clearly communicate to the developers the characteristics of the classes that they actually need to use to get their work done. I don't need or want to muddy the waters with various internal abstractions that they don't need to see or know about - there's no need for them to know how it all works under the hood, and it would just confuse and misdirect them, making for a very inefficient API learning process.
How can I accomplish this? Is there a well-known combination of arguments to 'javadoc' and perhaps some annotations that can make this happen?
Thanks very much for your consideration!

Assuming that you have followed best-practice and put your internal classes in different packages to your public APIs, you can run javadoc with the public API package names as command line arguments.
Refer to the javadoc command line synopsis for more details.
(If you haven't organized your packages to keep internal classes out of API packages, you may be in for a bit of pain ...)

In addition to Stephen C's answer and using the javadoc tool, you can specify exactly which packages appear in the javadoc (hence Stephen C's comment about 'pain' if they aren't organised logically) using something like this:
Say you have 5 classes and you want only the classes in the org.notprivate package to appear in the Javadoc:
org.notprivate.Foo
org.notprivate.Bar
org.notprivate.Stuff
org.notpublic.Things
org.notpublic.More
You can use something like:
javadoc -d target/api -source 1.6 -sourcepath src/main/java org.notprivate
That's just a quick example; if you need to specify each class, you'll need to look at the link Stephen C provided in more detail.
Posted here for clarity:
Javadoc Documentation

I would like to have ... an "external" set of Javadocs intended to clearly communicate to the developers the characteristics of the classes that they actually need to use to get their work done. I don't need or want to muddy the waters with various internal abstractions that they don't need to see or know about - there's no need for them to know how it all works under the hood, and it would just confuse and misdirect them, making for a very inefficient API learning process.
Given this desire, perhaps Javadoc isn't the best method of documenting the overall system view or for giving a "here's what you need to know"-type info to new developers?
I would recommend supplementing your Javadoc files with a separate guide/document/wiki/something to give the meta-view.

You can use some extra arguments when invoking the javadoc tool :
-public : Shows only public classes and members.
-protected : Shows only protected and public classes and members. This is the default.
-package : Shows only package, protected, and public classes and members.
-private : Shows all classes and members.
So, with these options you can generate a full documentation for internal usage, and give a 'light' documentation with only the public interface to your customers.
If you're using Eclipse, the Javadoc wizard shows radio buttons to help you choose the documentation level - which is "public fields only" by default.
