How to shrink external java libs? - java

I'm writing an applet, which uses ~10 external libraries. Together they occupy more than 2 megabytes. In some libs we use only 1-2 classes, so a lot of others can be safely deleted. So the question is how to remove unused classes from jar libraries?
A lot of other questions link to Proguard. But it doesn't process libraries (or I am doing something wrong) and also ruins parts of code which use reflection.

You could use the maven-shade-plugin and tell it to build a minimized jar file that combines your code and libs.

You could use something like ClassDep, which statically identifies which classes you will use.
However, this kind of analysis is easy to fool. Imagine some of your code contains:
Class.forName(className);
so you can dynamically build a classname and load that class. Tools like ClassDep can't identify these cases, so you'd need to perform comprehensive testing on your shrunken jars.
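As a minimal sketch of why static analysis misses this case, the example below assembles a class name from two strings at run time and loads it. The class name `ReflectiveLoadDemo` and the helper `loadByParts` are hypothetical names chosen for illustration; only `Class.forName` comes from the answer above.

```java
public class ReflectiveLoadDemo {

    // Builds the class name at run time and loads it reflectively.
    // The target class is never referenced by name in this source file,
    // so a tool that only follows static references has nothing to trace.
    static Class<?> loadByParts(String packageName, String simpleName)
            throws ClassNotFoundException {
        String className = packageName + "." + simpleName;
        return Class.forName(className);
    }

    public static void main(String[] args) throws Exception {
        Class<?> loaded = loadByParts("java.util", "ArrayList");
        System.out.println("Loaded reflectively: " + loaded.getName());
    }
}
```

If the target class had been stripped from a shrunken jar, the call would fail with a `ClassNotFoundException` only when that code path actually runs, which is why testing the shrunken jars matters.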

ProGuard can process your code together with the libraries (with the option -injars). You can still keep external libraries that you don't want to process (with the option -libraryjars).
Any automatic shrinking tool will have problems with reflection. ProGuard recognizes some basic reflection and it allows you to specify the parts of the internal API that should be preserved for the sake of reflection. ProGuard supports some powerful configuration, but depending on the amount of reflection in the libraries, it may still require trial and error.

You can simply "unzip" the JARs, take only the classes you want from each, and place them in a custom archive. Brian A. gave a good suggestion on how to identify those classes and some caveats. I would add that you may be violating licenses as well...
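The manual repackaging can also be scripted. The following is a rough sketch using the standard `java.util.jar` classes; the class name `JarTrimmer`, the method `copySelected`, and the command-line layout are assumptions made for this example, and the list of entries to keep still has to come from your own analysis.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarTrimmer {

    // Copies only the entries whose names appear in 'keep' from source to target.
    // Returns the number of entries copied.
    static int copySelected(Path source, Path target, Set<String> keep) throws IOException {
        int copied = 0;
        try (JarFile in = new JarFile(source.toFile());
             OutputStream raw = Files.newOutputStream(target);
             JarOutputStream out = new JarOutputStream(raw)) {
            Enumeration<JarEntry> entries = in.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !keep.contains(entry.getName())) {
                    continue; // skip directories and everything not explicitly kept
                }
                out.putNextEntry(new JarEntry(entry.getName()));
                try (InputStream data = in.getInputStream(entry)) {
                    int read;
                    while ((read = data.read(buffer)) != -1) {
                        out.write(buffer, 0, read);
                    }
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }

    // Hypothetical usage: java JarTrimmer in.jar out.jar com/foo/A.class com/foo/B.class
    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: JarTrimmer <source.jar> <target.jar> <entry>...");
            return;
        }
        Set<String> keep = new HashSet<>(Arrays.asList(args).subList(2, args.length));
        int copied = copySelected(Paths.get(args[0]), Paths.get(args[1]), keep);
        System.out.println("Copied " + copied + " entries");
    }
}
```

Entry names use the path form found inside the archive (for example `com/foo/A.class`), not dotted class names. Classes loaded only through reflection will not show up in a static list, so the caveats above still apply.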

Related

Soot Java bytecode framework: How to compile a single class file to Jimple/Shimple

I'm trying to figure out how to use Soot in an existing project (a metacircular interpreter). Specifically, I want to use Soot to convert java bytecode into a convenient 3-address code (either Jimple or Shimple) that I can interpret. I may want to do more things later, but for now I just want the conversion.
What's the best way to perform this translation? Soot seems like a ginormous project which has tons of functionality, but I really only need a single method:
compileClass: Byte[] -> ShimpleClass
Preferably as pure as possible (i.e. no setup/teardown required, everything packaged within that method). I've spent hours going over the javadoc/papers/presentations, but most of them seem focused on usage as a command line tool or an eclipse plugin. Could anyone give me some pointers as to where to start?
This is probably easiest answered on the Soot mailing list.
Soot is set up to load .class files from the file system but it should not be that hard to instruct it to load something from a ByteArrayInputStream as well. I guess that should help you in your case.

Eclipse plugin for jar dependency detection and possible safe removal

I have a series of eclipse projects that use a bunch of third party jars.
Many of them are included in several different versions.
But I have noticed that some of these libraries, due to code changes over the time, are not used any more but the reference to the library is there.
Is there any plugin that shows the jar dependencies of each project and which I can remove safely?
JarAnalyzer can be used for this purpose.
I am not aware of any plugin or tool that helps in doing what you want to do. However, there may be rules or procedures that help to reach the final end: a reduced set of libraries that is needed and consistent.
I have found the "Java API Compliance Checker", which allows you to compare two versions of the same library. It may help to reduce the number of libraries used for the same purpose. I have not used it, so I cannot tell you about my experience.
Decide whether it is allowed to have the same library available in different versions. Depending on the environment, this may or may not be allowed.
Incremental process to reduce the number of libraries needed:
Remove one library at a time from Eclipse.
Check whether compile errors result from that.
If so, resolve the compile errors.
When all are resolved, start your unit tests (you have unit tests, of course :-)) and see if any unit test breaks.
Do these steps for each library you want to remove.
In the end it could be worthwhile to look at a tool like Ivy, which allows you to manage the libraries explicitly, or even to switch to Maven, which offers the same.
Final remark: The usage of a library should be
decided by the architect of an application only and
documented in the architecture handbook together with the reasons for doing that.
Try opening your Manifest file. You can edit it and remove the dependency from there.

Need dependency tree for a Java source file

I need to extract some specific functionality from a large legacy Java codebase, in order to turn it into a standalone command-line application. This code is not documented at all and is not very modular or even clear. So I'm having a really hard time figuring out what I need to keep.
Basically what I need is a dependency tree, listing all the direct or indirect dependencies of this one *.java file. (Preferably I would like this listing to be in a format that I can save to a text file, as opposed to some un-copy-able whiz-bang GUI tree with a bazillion collapsed nodes...)
I'm using Eclipse for this detective work. I am an Eclipse beginner, but I figure that there may be Eclipse tricks/tools to perform this kind of operation with a bit less effort.
Any suggestions (using Eclipse or otherwise) would be appreciated.
There's a free version of eUML2: http://www.soyatec.com/euml2/features/eDepend/. One of its features is exactly what you need. One more thing: I'm not sure whether eUML2 can export text files.
Here is a fairly detailed guide to installing eUML2.
I've used Dependency Finder for this kind of work recently and it works well.
You can also use Javadoc generation. For undocumented code the generated Javadoc will not say much about the methods, but it will show which classes extend or implement which other classes and interfaces, which gives you a sort of dependency tree.

Getting help on MATLAB's com.mathworks internals

It is possible to access bits of MATLAB's internal java code to programmatically change MATLAB itself. For example, you can programmatically open a document in the editor using
editorServices = com.mathworks.mlservices.MLEditorServices;
editorServices.newDocument() %older versions of MATLAB seem to use new()
You can see the method signatures (but not what they do) using methodsview.
methodsview(com.mathworks.mlservices.MLEditorServices)
I have a few related questions about using these Java methods.
Firstly, is there any documentation on these things (either from the Mathworks or otherwise)?
Secondly, how do you find out what methods are available? The ones I've come across appear to be contained in JAR files in matlabroot\java\jar, but I'm not sure what the best way to inspect a JAR file is.
Thirdly, are there functions for inspecting the classes, other than methodsview?
Finally, are there any really useful methods that anyone has found?
There is no official documentation nor support for these classes. Moreover, these classes and internal methods represent internal implementation that may change without notice in any future Matlab release. This said, you can use my uiinspect and checkClass utilities to investigate the internal methods, properties and static fields. These utilities use Java reflection to do their job, something which is also done by the built-in methodsview function (I believe my utilities are far more powerful, though). In this respect, I believe we are not crossing the line of reverse-engineering which may violate Matlab's license.
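For readers unfamiliar with the reflection technique mentioned above, here is a minimal Java sketch of listing a class's public method signatures. It is not how uiinspect, checkClass, or methodsview are actually implemented; the class name `MethodLister` and the method `publicMethodSignatures` are made up for this example, and inspecting a MATLAB class this way would additionally require MATLAB's jars on the classpath.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MethodLister {

    // Returns the signatures of all public methods of the given type,
    // including inherited ones, sorted alphabetically.
    static List<String> publicMethodSignatures(Class<?> type) {
        List<String> signatures = new ArrayList<>();
        for (Method method : type.getMethods()) {
            signatures.add(method.toGenericString());
        }
        Collections.sort(signatures);
        return signatures;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to a JDK interface so the sketch runs without extra jars.
        String className = args.length > 0 ? args[0] : "java.lang.Runnable";
        for (String signature : publicMethodSignatures(Class.forName(className))) {
            System.out.println(signature);
        }
    }
}
```

As with methodsview, this shows what the methods look like, not what they do.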
If you are looking for documentation, then my UndocumentedMatlab.com website has plenty of relevant resources, and more is added on a regular basis so keep tuned.
I am also working on a book that will present a very detailed overview of all these internal classes, among other undocumented stuff - I hope to have publication news later this year.
I am an Eclipse fan. If you use that as your IDE, you can import the jar into one of your projects and inspect the methods in there.
To find out more about java objects, I use uiinspect.
The only place I know that is documenting the Matlab hidden Java stuff is Undocumented Matlab by Yair Altman. His site lists plenty of very useful tricks. Being able to use Java to format text in list boxes has come in very handy for me, for example.
EDIT
The man has spoken. Listen to him, since I don't think there's anyone outside MathWorks who knows more about Matlab's internal java code.
Undocumented Matlab is a great place to start looking.

How does one weed out dependencies in a large project?

I'm about to inherit a rather large Java enterprise project that has a large number of third-party dependencies. There are at least seventy JARs included, and some of them seem to be unused, e.g. spring.jar, which I know isn't used.
It seems that over the years as various developers have touched upon the code base they have all tried out new project-of-the-month type libraries.
How does one go about getting rid of these? Within reason of course, as clearly some dependencies are helpful to not have to re-invent the wheel.
I'm obviously interested in Java-based projects, but I welcome answers across languages that people think will be helpful.
Personally, I think you have to start by assessing the scale of the problem. It's going to be fairly painful, but I'd make a list of the dependencies and work out exactly which parts of the project use which ones.
Then I'd work out exactly what features of each you're actually making use of (in many cases, you'll end up having a massive third party library which you're using a tiny part of).
Once you have this information, you'll at least know what you're dealing with.
My next step would be to look at all of the dependencies that you only use to a small extent. Checking around might uncover things that you could use from other libraries that would eliminate the lesser used libraries.
I'd also have a look around to see if there's anything small that you could just re-write and include in your own code-base.
Finally, I'd have a look around at the vendors of your dependencies and their competitors to see if the latest versions contain more functionality that will allow you to eliminate a few others.
Then you're just left wondering whether it's better to be highly dependent on a few vendors, or less dependent on a lot of vendors!! ;o)
structure101 http://www.headwaysoftware.com/products/structure101/index.php
It's a great tool for showing dependencies. I've been using it for a couple of years.
If you have a good set of automated tests, and you're looking to remove libraries which are not used at all, you could just use trial and error. One at a time, remove a library, and run your tests to see if everything still works. If not, put it back. Of course, if you can't even build without a library, you probably need it.
Basically, however you go about it, my idea is to remove them one at a time and see what breaks. If nothing breaks, odds are good you can just toss the library. If the problem is very minor (e.g. you need one method of one class in a large library), you might be able to code around it.
If you're dealing with a standalone application, you could give the JVM the -verbose:class option to see which classes are being loaded. This should give you messages like:
[Opened C:\Program Files\Java\jre1.6.0_04\lib\rt.jar]
[Loaded java.util.regex.Pattern$Single from C:\Program Files\Java\jre1.6.0_04\lib\rt.jar]
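To turn that output into a list of the jars that were actually touched, the lines can be filtered with a small program. This is a sketch that assumes the `[Loaded <class> from <source>]` line format shown above (other JVM versions may format these lines differently); the class name `VerboseClassLog` and the method `sourcesOf` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VerboseClassLog {

    // Matches lines such as:
    // [Loaded java.util.regex.Pattern$Single from C:\...\lib\rt.jar]
    private static final Pattern LOADED =
            Pattern.compile("^\\[Loaded (\\S+) from (.+)\\]$");

    // Collects the distinct sources (jars or directories) that classes were loaded from.
    static Set<String> sourcesOf(Iterable<String> lines) {
        Set<String> sources = new TreeSet<>();
        for (String line : lines) {
            Matcher matcher = LOADED.matcher(line.trim());
            if (matcher.matches()) {
                sources.add(matcher.group(2));
            }
        }
        return sources;
    }

    // Reads the captured -verbose:class output from standard input.
    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        for (String source : sourcesOf(lines)) {
            System.out.println(source);
        }
    }
}
```

A jar that never appears in the result was not loaded from during that particular run. That is only as good as the code paths the run exercised, so it is a hint for candidates to remove rather than proof.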
I read about an approach using instrumentation here, never tried it, but sounds reasonable.
We went through an exercise like this on a Delphi codebase. We dramatically simplified our external dependencies. Basically, we went about it like this:
Catalogued all external libraries and components
Catalogued (using a file search tool) where they were used, and what for.
Removed everything we didn't use or didn't need (some libraries were used in code that was no longer needed).
Made a ranking of which libraries we favored, basing this on whether the library was actively developed, how much functionality it offered that we used, how difficult it was to port the code that used it to another library that we already used and so on.
Finally, we iteratively removed dependencies on libraries low on the list by porting that functionality to another library.
This was, however, quite a lot of work.
If you take the approach of "remove things until it won't compile" you need to be very careful about transitive runtime dependencies. If there's a good quality test suite, it can help, but you'll certainly need to run a test coverage tool like Cobertura to make sure that enough of the code is getting tested to exercise your full dependency graph.
How much code are you talking about? The review-based approach suggested by Joeri frankly seems the best to me; it has the added advantage of making you at least superficially familiar with all parts of the system. If you're just inheriting a big project, this is something you should probably take the time to do anyway.
If you have a full regression test suite for this project, all you have to do is run the suite in a loop, removing one more JAR each time. It is not fast, but it is easy to do.
