Determining what minimal jars are needed for a feature - java

How do you determine what jars are needed for such and such feature of a framework? For example, what jars would be needed out of all those available for Spring in order to support only dependency injection?

There are tools that create minimal JARs by statically analyzing the code to figure out which classes an application actually uses, then creating a new JAR containing only those classes. (I recall using Zelix KlassMaster to do this, but there are many alternatives.)
The problems with using these tools for a DI framework like Spring include:
The existing tools only trace static dependencies. If you dynamically load classes, you have to specifically tell the analyser about each one. DI frameworks in general, and Spring in particular, are replete with dynamic loading, including dynamic loading that is opaque to application code.
The existing tools work by creating a new output JAR, not by telling you which of the input JARs are not used. While repackaging the JARs is OK if you are creating a shrink-wrapped application from a closed-source codebase, it is undesirable in general, and potentially problematic with some open-source licenses. Certainly you don't want to do this with Spring.
In theory, someone could write a tool to help. In practice, the tool would need to (for example) know how to extract dynamic class dependencies from Spring configurations expressed in annotations and XML, and from bean descriptors created at runtime from higher-order configuration (Spring Security does this, for example). That is a big ask. And even then you have the problem that a "small" change to the wirings made on the installation platform could fail due to a required JAR having been left out by the JAR pruning process.
In my view, the more practical alternatives are:
If you use Maven / Ivy to manage your dependencies, look at the dependency graphs, strip out dependencies that appear to be no longer needed ... and test, test, test.
Manually strip out JARs that appear to be unused ... and test, test, test.
Don't worry about it. A moderate level of unused JAR cruft might add a second or three to deployment and webapp startup times, but that generally doesn't matter. (But if it does ... see above.)
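To make the static-analysis limitation above concrete, here is a minimal sketch of the kind of dynamic loading a DI container performs internally. The class name arrives as a string (here from a hypothetical Properties key named list.impl, standing in for a bean definition), so the compiled code contains no class reference to the implementation, and a tool that only follows static references would not see it. The class DynamicLoadingDemo and the key name are illustrative assumptions, not Spring APIs.

```java
import java.util.Properties;

public class DynamicLoadingDemo {

    // Instantiates whatever class the configuration names. The key "list.impl"
    // is a made-up example for this sketch, not a real Spring setting.
    static Object createFromConfig(Properties config) throws ReflectiveOperationException {
        String className = config.getProperty("list.impl");
        // The only link to the implementation class is this runtime string.
        Class<?> type = Class.forName(className);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.setProperty("list.impl", "java.util.LinkedList");
        Object bean = createFromConfig(config);
        System.out.println(bean.getClass().getName());
    }
}
```

If the JAR that holds the named class were pruned, this code would still compile and would only fail at runtime with a ClassNotFoundException, which is why the "test, test, test" advice matters.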

This is why some older Java projects end up having 600 Jars and a 200 MB war file, for a 10,000 line application. Kind of a pain if you don't manage it carefully...

You should really ask the framework provider or read the documentation. Statically analyzing which jars are required might not be enough in some cases (dynamic loading), and sometimes you might end up with too many jars.
I once added some FTP helper code to a sort of "utility" library. It depended on an Apache FTP jar. If you never used the FTP features of the library, you would not need the FTP jar, but static analysis of the code might say you need it. This is something the library author should document.

Related

Library does not find its own "sublibrary"

I'm trying to make an addresslist in Java, which saves its contents in a Sqlite database.
Therefore (and for other future uses), I tried to create my own library for all kinds of database connections ("PentagonsDatabaseConnector-1.0.jar"). It currently supports SQLite and MySQL.
It references other libraries that provide the JDBC drivers ("mysql-connector-java-8.0.16.jar" and "sqlite-jdbc-3.30.1.jar").
Problem: My library works just fine if I'm accessing it from its own project folder, but as soon as I compile it and add it to the "Adressliste" project, it isn't able to find the JDBC drivers anymore (I can access the rest of my self-written library without problems though). Also, as shown in the screenshot, "PentagonsDatabaseConnector-1.0.jar" brings the JDBC libraries along in its own "lib" folder.
LINK TO THE SCREENSHOT
Do you guys have an idea what's wrong?
Thank you for your help!
Ps: Sorry for bad English, I'm German :)
Java cannot read jars-in-jars.
Dependencies come in a few flavours. In this case, PentagonsDC is a normal dependency; it must be there at runtime, and also be there at compile time.
The JDBC libraries are a bit special; they are runtime-only deps. You don't need them to be around at compile time. You want this, because JDBC libraries are, as a concept, pluggable.
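A minimal sketch of why the JDBC driver can be a runtime-only dependency: application code talks to java.sql.DriverManager and never names a vendor class, so it compiles without any driver jar and only needs one on the classpath when it runs. The class name RuntimeDriverDemo and the URL jdbc:sqlite:addresses.db are assumptions made up for this illustration.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

public class RuntimeDriverDemo {

    // Returns true if some driver on the runtime classpath accepts the URL.
    // Nothing here references a vendor class, so no driver jar is needed to compile.
    static boolean driverAvailable(String jdbcUrl) {
        try {
            DriverManager.getDriver(jdbcUrl);
            return true;
        } catch (SQLException noSuitableDriver) {
            return false;
        }
    }

    public static void main(String[] args) {
        String url = "jdbc:sqlite:addresses.db"; // assumed example URL
        System.out.println(url + " -> driver present: " + driverAvailable(url));
    }
}
```

Run without a driver jar on the classpath, the check reports that no driver is present; adding the driver jar at runtime changes the answer without recompiling anything.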
Okay, so what do I do?
Using a build system to manage your dependencies is the answer 90%+ of Java programmers go to, and it is what I recommend you do here. Particularly for someone starting out, I advise Maven. You just list the names of your dependencies in a text file and Maven takes care of it, at least at compile time.
For the runtime aspect, you have a few options. It depends on how your java app runs.
Some examples:
Manifest-based classpaths
You run your Java application 'stand alone': you wrote the public static void main(String[]) method that starts the app, and you distribute it everywhere it needs to run. In this case, the usual strategy is to have an installer. (You need a JVM on the client to run your application, and neither Oracle nor any OS vendor supports maintaining a functioning JVM on end users' systems anymore; it is now your job, and unfortunately it is non-trivial.) Given that you have an installer, deploy your jars so that the main jar's manifest (jars are zips; the manifest ends up at META-INF/MANIFEST.MF) contains:
Main-Class: com.of.yourproj.Main
Class-Path: lib/sqlite-jdbc.jar lib/mysql-jdbc.jar lib/guava.jar
And then have a directory structure like so:
C:\Program Files\yourapp\yourapp.jar
C:\Program Files\yourapp\lib\sqlite-jdbc.jar
C:\Program Files\yourapp\lib\mysql-jdbc.jar
Or the equivalent on any other OS. The classpath entries in the manifest are space separated and resolved relative to the directory that 'yourapp.jar' is in. Done this way, you can run yourapp.jar from anywhere, and every jar listed in Class-Path is available to it.
Build tools can make this manifest for you.
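For illustration, this is roughly what a build tool does when it generates such a manifest, sketched with the JDK's own java.util.jar API. The class name ManifestSketch and its build helper are hypothetical names for this example; the main class and jar names are the ones used above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestSketch {

    // Builds a manifest with a main class and a space-separated Class-Path.
    static Manifest build(String mainClass, String... classPathEntries) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version has to be set; without it the main attributes
        // are skipped when the manifest is written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", classPathEntries));
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        Manifest manifest = build("com.of.yourproj.Main",
                "lib/sqlite-jdbc.jar", "lib/mysql-jdbc.jar");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        // Print what would end up in META-INF/MANIFEST.MF.
        System.out.print(out.toString(StandardCharsets.UTF_8.name()));
    }
}
```

In a real build you would hand the Manifest to a JarOutputStream (or let Maven or Gradle do it) rather than printing it.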
Shading / Uberjars
Shading is the notion of packing everything into a single giant jar; not jars-in-jars, but unpack the contents of your dependency jars into the main app jar. This can be quite slow in the build (if you have a few hundred MB worth of deps, those need to be packed in and all class files need analysis for the shade rewrite, that's a lot of bits to process, so it always takes some time). The general idea behind shading is that deployment 'is as simple as transferring one jar file', but this is not actually practical, given that you can no longer assume that end users have a JVM installed, and even if they do, you cannot rely on it being properly up to date. I mention it here because you may hear this from others, but I wouldn't recommend it.
If you really do want to go for this, the only option is build systems: They have a plugin to do it; there is no command line tool that ships with java itself that can do this. There are also caveats about so-called 'signed jars' which cannot just be unpacked into a single uberjar.
App container
Not all Java apps are standalone where you provide the main. If you're writing a web service, for example, you have no main at all; the framework does. Instead of a single entrypoint ('main', the place where your code initially begins execution), web services have tons of entrypoints: one for every URL you want to respond to. The framework takes care of invoking them, and usually these frameworks have their own documentation and specs for how dependencies are loaded. Usually it is a matter of putting a jar in one place and its dependencies in a subdir named 'lib', or you build a so-called war file, but there are so many web frameworks and so many options that there is no single answer. The good news is that it's usually simple, and the tutorial of said framework will cover it.
This advice applies to any 'app container' system; those are usually web frameworks, but there are non-web related frameworks that take care of launching your app.
Don't do these
Don't force your users to manually supply the -classpath option or mess with the CLASSPATH environment variable.
Don't try to write a custom classloader that loads jars-in-jars.
NB: SQLite is rather complicated for Java; it doesn't get you many of the benefits that the 'lite' is supposed to bring, as it is a native dependency. The simple, works-everywhere solution in the Java sphere is 'h2', which is written entirely in Java, so shipping the entire h2 engine as part of the Java app is possible with zero native components.

Find out which Java classes are actually loaded and reduce jar

Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
It may not be a good idea to automate this, as the application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way that I can think of is to remove the JARs one by one and check whether the application still runs as expected. Even with this approach, all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency, and must remove unwanted dependencies once his/her piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List of all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done parts 1 and 2, so I can help you.
1) Find out all the classes that are loaded. You need 100% code coverage (I am not talking about tests, but production). So run all possible scenarios, so that all the classes your app needs will be loaded and logged.
To log loaded classes, try several approaches: reflection, the -verbose:class flag, or a Java agent, which lets you modify methods at runtime. This is an example of some Java agent code, or another Java agent example.
2) To find all the classes available in a jar, you can write a program. You need to know all the places where the application jars are located. Loop through these jars (you can use ZipFile), loop through their ZipEntry entries, and collect all the classes.
3) After that write a script or program that reassembles your application. For example, now you can create a new jar file for each library and put there only needed classes.
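Steps 2 and 3 can be sketched roughly as follows. This is only an illustration under assumptions: the class JarClassLister and its methods are hypothetical names, the set of loaded class names is assumed to come from whatever logging you did in step 1, and special entries such as module-info.class or multi-release directories are not treated specially.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarClassLister {

    // Step 2: collect the fully qualified names of all classes in one jar.
    static Set<String> classesIn(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile file = new JarFile(jar.toFile())) {
            for (JarEntry entry : Collections.list(file.entries())) {
                String name = entry.getName();
                if (name.endsWith(".class")) {
                    // "com/acme/Foo.class" -> "com.acme.Foo"
                    String stem = name.substring(0, name.length() - ".class".length());
                    names.add(stem.replace('/', '.'));
                }
            }
        }
        return names;
    }

    // Input for step 3: classes present in the jar but never seen in the load log.
    static Set<String> neverLoaded(Set<String> available, Set<String> loaded) {
        Set<String> unused = new TreeSet<>(available);
        unused.removeAll(loaded);
        return unused;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: JarClassLister <jar> [loaded.class.Name ...]");
            return;
        }
        // args[0]: jar to inspect; remaining args: class names seen at runtime.
        Set<String> loaded = new TreeSet<>(Arrays.asList(args).subList(1, args.length));
        neverLoaded(classesIn(Paths.get(args[0])), loaded).forEach(System.out::println);
    }
}
```

The printed names are only candidates for removal; classes reached through reflection in scenarios you did not exercise would show up here too.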
You may also use a tool (again, you are a programmer, so write a program) that checks the code for class dependencies. You do not want to remove classes if they are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
As @Gokul Nath KP notes, I did this before. I manually changed the Gradle and Maven dependencies, removing them one by one, and then ran a full regression test. It took me a week (our application was small compared to modern enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!

Java: Libraries within Libraries and Classpath Issues

We are having a discussion at work and an interesting point came up:
Say you are developing a small library, call it somelib. Say that somelib needs to do some logging, but you don't want to reinvent the wheel, so you decide to use a 3rd party logging library.
Additionally, you want to make integration of somelib as painless as possible, so you distribute a single JAR file (somelib.jar), which has the other logging JAR, call it logger.jar, embedded inside of it. Much like what Maven's jar-with-dependencies assembly does.
Now comes the issue. Since your product is a library, what if your customer is using somelib and also happens to be using a different version of the same logging library on their own? Now we have a classpath problem.
This seems to me like it would be a common problem for people that write libraries, so what is the typical solution?
1. Do they avoid using JAR bundling methods altogether? Even if we do that, there is still an issue with a user's code expecting version X of the logging library, and somelib's code expecting version Y.
2. Do they somehow insert a dummy package prefix so that the logger classes in somelib won't conflict?
3. What about dynamic loading of the logger library? (though this still has versioning problems from 1.)
You may consider using OSGi, or the module system from Project Jigsaw, which was originally targeted at JDK 8 and shipped in JDK 9.

Hibernate: is it possible to reduce file size of jar?

I am working on a desktop application that uses Hibernate and HSQLDB. When I package my application as a runnable jar file, its file size is bigger than I expected. I see that the biggest part comes from Hibernate and its dependencies. I am not sure if I need all of the Hibernate features. Is there a way to get rid of the parts of Hibernate and its dependency libraries which I don't use?
Under the /lib/ folder in the Hibernate zip you will see a folder called /required/. For very basic Hibernate apps that's all you will need, though you may need additional JARs for things such as JPA. I would start by only including the JARs in the lib/required/ directory, see if your project works, and if it doesn't, add what you need to get your project working again.
Perhaps you could use a tool to analyse your classes and dependencies (e.g. http://www.dependency-analyzer.org/). Here is another post about it: How do I find out what jar files are actually used when compiling a java project.
The other way is to remove some jars (or even single class files) and check whether your application still works. But I don't think this is a very good way...
I can't think of a better tool for this than ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.

Java - How to reduce the size of third-party jars to reduce the size of your application

I am developing a Java web application that includes an applet. That applet is dependent on two jar files:
JFreeChart (for plotting graphs at the client side) - 1.7 MB (size of jar file)
MySqlJdbcConnector (for storing data, captured at the client side, to a remote database) - 0.7 MB (size of jar file)
Now, the problem is the size of the above two jar files. The total size of my applet jar (myApplet.jar) is 2.5 MB, out of which 2.4 MB is because of the above two jar files. I am not using all the classes in those jar files. Specifically, for jfreechart, I am using a very small number of classes from that library.
Questions
Q1. For creating myApplet.jar file, what I have done is I have unzipped both of the jar files (jfreechart and mySQLJdbcConnector) and then packed the unzipped version of the jar files with the source code of my applet code to create one single jar file (i.e myApplet.jar). Is it the correct way of packing third party jar files with your applet code? Is there any way by which I can optimize this?
Q2. I tried to find the dependencies of the classes of jfreechart library which I am using in my application so as to pack only those dependencies in myApplet.jar. For that purpose, I used DependencyAnalyzer to find the dependencies of all the classes. But later I found it difficult to do so manually because every class (class of jfreechart that I am using in my application) has lot of dependencies and I am using some 15 classes of jfreechart so doing this for every class will be very difficult. So any suggestion on this?
Q3. Is this a common situation for developers, or am I missing something that forces me to do this?
I'd suggest trying out ProGuard. You can exclude parts of jar files you're not using.
Yes you can save space by creating a JAR containing only the classes that your applet requires. (I've seen this called an uber-JAR.)
There are various tools for doing this; e.g. ProGuard, Zelix KlassMaster, a Maven plugin whose name I forget and so on.
There are however a couple of issues, at least in the general case:
If your code uses dynamic loading (e.g. by calling Class.forName(className)), these tools generally cannot detect the dependency. So to avoid dynamically loaded classes being left out of the final JAR, you need to tell the tool the names of all classes that your application might explicitly load that way.
You need to take a look at the license of the third party library. IIRC, some licenses require you to include the library in your distributed artifacts in a way that allows people to substitute a different version of the library. One could argue that an uber-JAR makes this hard to do, and therefore could be problematic.
JFreeChart is LGPL, and LGPL is a license that has the requirement above. However MySQL is GPL, which trumps LGPL, and this means that your applet must be GPL'ed ... if you distribute it.
Finally, if you want to minimize the size of your applet JAR, you should NOT include your source code in the JAR. Source code should be in a separate JAR (or ZIP, TAR or whatever) file.
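Regarding the dynamic-loading issue above: whichever shrinking tool you use, one cheap safeguard is to check, against the shrunken JAR, that every class name you load dynamically still resolves. The following is a minimal sketch of such a check, not a feature of ProGuard or any other tool; the class KeepListCheck and its missing method are hypothetical names, and you supply the list of class names yourself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeepListCheck {

    // Returns the names that cannot be resolved on the current classpath.
    static List<String> missing(List<String> dynamicallyLoadedNames) {
        List<String> notFound = new ArrayList<>();
        ClassLoader loader = KeepListCheck.class.getClassLoader();
        for (String name : dynamicallyLoadedNames) {
            try {
                // initialize=false: only check presence, do not run static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Pass the dynamically loaded class names as arguments.
        System.out.println("Missing: " + missing(Arrays.asList(args)));
    }
}
```

Run it with the shrunken JAR on the classpath and the class names as arguments, for example the JDBC driver class your code passes to Class.forName; any name it reports was stripped and needs to be added to the tool's keep configuration.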
A1:
You can create an Ant script or use Eclipse or any other IDE to automatically package your applet. But your way is correct, too.
A2:
I wouldn't do these things manually. Finding transitive dependencies is very complex. Maybe darioo's answer is a better way to do this.
A3:
This is very common indeed. A couple of hints:
You can always re-build those third party libraries without debug information. That should slightly decrease the size of those libraries.
On the other hand, maybe you shouldn't have a direct connection from your applet to a database. You could create an RMI interface (or something similar) to transfer your SQL and result data to an application server, which actually executes your SQL. This is an important security aspect for your applet, if you don't run this in a safe intranet.
