How would you determine the classes (non Sun JDK classes) loaded / unused by a Java application?
Background:
I have a legacy Java Web Start application that has gone through a lot of code changes and now has a lot of classes, most of which are not used. I would like to reduce the download size of the application by deploying only the classes that will actually be used, instead of jarring all the packages.
I will also use the same process to completely delete these unused classes.
Use java -verbose:class to see what classes are loaded, then use grep (or any other tool) to keep only the lines from your packages.
A small limitation: it will only tell you which classes are really used when they are used, so you must cover all use cases of your application.
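As a rough sketch of the filtering step in Java instead of grep: the class name VerboseClassFilter and the default prefix com.example. are made up for illustration. The two line formats handled here are the ones HotSpot typically prints (the older "[Loaded ... from ...]" style and the newer "[class,load]" unified-logging style); other JVMs or versions may print something different, so treat the patterns as assumptions to check against your own output.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class VerboseClassFilter {
    // Older HotSpot style (assumed): "[Loaded com.example.Foo from file:/...]"
    private static final Pattern LEGACY =
            Pattern.compile("\\[Loaded (\\S+) from ");
    // Unified logging style (assumed): "[0.123s][info][class,load] com.example.Foo source: ..."
    private static final Pattern UNIFIED =
            Pattern.compile("\\[class,load\\s*\\]\\s+(\\S+)");

    /** Extracts the class name from one log line, if the line reports a class load. */
    public static Optional<String> classNameOf(String line) {
        Matcher m = LEGACY.matcher(line);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        m = UNIFIED.matcher(line);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    /** Keeps only the loaded classes whose name starts with the given package prefix. */
    public static List<String> loadedFrom(List<String> logLines, String packagePrefix) {
        List<String> result = new ArrayList<>();
        for (String line : logLines) {
            classNameOf(line)
                    .filter(name -> name.startsWith(packagePrefix))
                    .ifPresent(result::add);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical default prefix; pass your own package prefix as the first argument.
        String prefix = args.length > 0 ? args[0] : "com.example.";
        List<String> lines = new BufferedReader(new InputStreamReader(System.in))
                .lines()
                .collect(Collectors.toList());
        loadedFrom(lines, prefix).forEach(System.out::println);
    }
}
```

The program reads the verbose output from standard input, so the output of a run started with -verbose:class can be piped into it or saved to a file first and fed in later.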
You can use a good IDE for that.
For instance, IntelliJ IDEA analyzes the source code for dependencies and lets you safely delete a class, method, or attribute that is not used by any other code.
That way you can get rid of all your dead code.
Related
Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
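To illustrate the kind of access that static analysis can miss, here is a minimal sketch (the class name ReflectiveLoad is invented for this example). The class name is assembled at run time, so the compiled code carries no direct reference to java.util.ArrayList that a tool scanning the bytecode could follow.

```java
public class ReflectiveLoad {
    /**
     * Loads and instantiates a class whose name is only known at run time.
     * Nothing in this method's bytecode names the target class directly.
     */
    public static Object create(String packageName, String simpleName)
            throws ReflectiveOperationException {
        Class<?> type = Class.forName(packageName + "." + simpleName);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object list = create("java.util", "ArrayList");
        System.out.println(list.getClass().getName()); // prints java.util.ArrayList
    }
}
```

If the class loaded this way lived in a minimized JAR and had been stripped, the call would fail with a ClassNotFoundException only at run time, which is why such classes usually have to be kept explicitly.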
It may not be a good idea to automate this, as the application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way that I can think of is to remove the JARs one by one and check whether the application still runs as expected. Even with this approach, all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency and should remove unneeded dependencies once his/her piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done parts 1 and 2, so I can help you with those.
1) Find out all the classes that are loaded. You need 100 % code coverage (I am not talking about tests, but production). So run all possible scenarios, so all the classes your app needs will be loaded and logged.
To log loaded classes, you can try several approaches: reflection, the -verbose:class flag, or a Java agent, which lets you modify methods at run time. This is an example of some Java agent code or another Java agent example
2) To find all the classes available in a jar, you can write a program. You need to know all the places where the application jars are stored. Loop through these jars (you can use ZipFile), loop through their ZipEntry entries, and collect all the classes.
3) After that, write a script or program that reassembles your application. For example, you can now create a new jar file for each library and put only the needed classes in it.
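Step 2, plus the comparison that feeds step 3, could look roughly like the sketch below. It uses java.util.zip.ZipFile as suggested; the class name JarClassLister and its method names are invented for illustration, and the set of loaded classes is assumed to come from whatever logging you used in step 1.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarClassLister {
    /** Converts an entry name such as "com/example/Foo.class" to "com.example.Foo". */
    public static String toClassName(String entryName) {
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Collects the names of all .class entries in one jar file. */
    public static Set<String> classesIn(String jarPath) throws IOException {
        Set<String> names = new TreeSet<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    names.add(toClassName(entry.getName()));
                }
            }
        }
        return names;
    }

    /** Classes that are available in the jars but were never seen loaded. */
    public static Set<String> unused(Set<String> available, Set<String> loaded) {
        Set<String> result = new TreeSet<>(available);
        result.removeAll(loaded);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarClassLister some.jar [another.jar ...]
        for (String jar : args) {
            classesIn(jar).forEach(System.out::println);
        }
    }
}
```

Entries such as module-info.class or classes under META-INF/versions/ are listed like any other entry here, so you may want to filter them separately.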
You may also use a tool (again, you are a programmer, so write a program) that checks the code for dependencies between classes. You do not want to remove classes that are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
As #Gokul Nath KP notes, I did this before. I manually changed the Gradle and Maven dependencies, removing them one by one, and then ran a full regression test. It took me a week (our application was small compared to modern enterprise systems created by hundreds of developers).
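Once such a dependency graph exists, finding the classes that are still needed is a plain reachability search from the entry points. The sketch below assumes the graph has already been built (extracting the edges from source or bytecode is not shown), and the names ClassReachability, app.Main, app.Service, app.Dao and app.Legacy are invented for illustration. Like any static approach, it does not see classes that are only reached through reflection.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ClassReachability {
    /** Breadth-first search over "class -> classes it depends on" edges. */
    public static Set<String> reachable(Map<String, Set<String>> dependsOn,
                                        Collection<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (!seen.add(current)) {
                continue; // already visited
            }
            for (String dep : dependsOn.getOrDefault(current, Collections.emptySet())) {
                if (!seen.contains(dep)) {
                    queue.add(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: Legacy depends on Dao, but nothing depends on Legacy.
        Map<String, Set<String>> graph = new HashMap<>();
        graph.put("app.Main", Collections.singleton("app.Service"));
        graph.put("app.Service", Collections.singleton("app.Dao"));
        graph.put("app.Legacy", Collections.singleton("app.Dao"));

        // Every class mentioned anywhere in the graph.
        Set<String> all = new TreeSet<>(graph.keySet());
        graph.values().forEach(all::addAll);

        all.removeAll(reachable(graph, Collections.singleton("app.Main")));
        System.out.println(all); // prints [app.Legacy]
    }
}
```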
So, be creative, and in case of success, your project will be used by millions!
After finishing my project, I want to remove all the unused classes to reduce the size of jar file when packaging.
I am using IntelliJ. It can help me detect unused classes, but it also reports some classes that are only called by reflection (at run time). Moreover, it cannot detect unused classes in external libraries.
One important thing: I want to remove unused classes in external libraries. For example, when I use BiMap from Google Guava, I have to include the Guava lib, but I only want to use BiMap, and including the whole of Guava makes my jar big.
So I thought about it the other way round: instead of finding unused classes, I want to know all the classes that are used/called when the application runs (I will remove the unused classes/packages manually). How can I do that?
Consider using a tool like ProGuard (http://proguard.sourceforge.net/) to do this.
I am unsure how you can limit the contents of the jar file to only the referenced Java classes. You may also run into issues when a class is loaded dynamically.
Guava explains on their site how you can include a subset of Guava in your build, by using ProGuard: https://github.com/google/guava/wiki/UsingProGuardWithGuava
I am using Soot to instrument the classes of an application, but I've found no way to instrument classes dynamically with it. Soot only detects static links, which causes failures for programs that use dynamic loading. So I have to detect which classes are dynamically loaded in a program. Suppose I don't have the option of instrumenting all classes, for practical reasons. For example, I would have to instrument the whole JDK, which could take hours, because there is the possibility that a JDK class is loaded at run time.
What I ultimately want from this tool/method is the fully qualified names of all classes that a program uses.
People usually use TamiFlex in combination with Soot for such issues:
https://code.google.com/p/tamiflex/
TamiFlex lets you record dynamic loading with very little overhead.
I am working on a desktop application that uses Hibernate and HSQLDB. When I package my application as a runnable jar file, its file size is bigger than I expected. I see that the biggest part comes from Hibernate and its dependencies. I am not sure if I need all of the Hibernate features. Is there a way to get rid of the parts of Hibernate and its dependency libraries which I don't use?
Under the /lib/ folder in the Hibernate zip you will see a folder called /required/. For very basic Hibernate apps that's all you will need, though you may need additional JARs for things such as JPA. I would start by only including the JARs in the lib/required/ directory, see if your project works, and if it doesn't, add what you need to get your project working again.
Perhaps you could use a tool to analyse your classes and dependencies (e.g. http://www.dependency-analyzer.org/). Here is another post about it: How do I find out what jar files are actually used when compiling a java project.
The other way is to remove some jars (or even single class files) and check whether your application still works, but I think this is not a very good way...
I can't think of a better tool for this than ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.
Is there a way to check if all boot (core) java classes (belonging to the Java Runtime Environment) have been loaded/initialized for use in Java?
I need to check this in a rare situation where I have access to the JRE but not to the actual application, so I cannot simply wait for the main application to run and execute from that point on.
The JVM will load classes on an "as-needed" basis, so there's no one point at which "all" of the classes on the bootstrap classpath will have been loaded.
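A small way to observe this on-demand behaviour from inside a program is the standard ClassLoadingMXBean. The sketch below (the class name LazyLoadingDemo is invented, and the choice of java.util.concurrent.Phaser as a "probably not loaded yet" class is only an assumption) reads the total loaded-class counter before and after first use of a class. The counter typically goes up, but the exact numbers depend on the JVM and its startup behaviour.

```java
import java.lang.management.ManagementFactory;

public class LazyLoadingDemo {
    /** Total number of classes loaded since the JVM started (never decreases). */
    public static long totalLoaded() {
        return ManagementFactory.getClassLoadingMXBean().getTotalLoadedClassCount();
    }

    public static void main(String[] args) {
        long before = totalLoaded();

        // First use of a class that the JVM has probably not needed so far (assumption).
        new java.util.concurrent.Phaser();

        long after = totalLoaded();
        System.out.println("loaded before: " + before + ", after: " + after);
    }
}
```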
That said, from 1.5 and onward, the Sun JVMs use "class data sharing" to pre-load a specific set of classes. I don't know which classes get loaded, but would suspect it's limited to those in the java.lang package.
If you simply want to keep track of when classes get loaded, use the -verbose:class command-line option when starting the JVM.
Having read your comments (and accidentally deleted my cookie), all I can say is that JVMTI is pretty much guaranteed to be a much better choice for whatever it is that you're trying to do.
But if you're hell-bent on modifying a JRE class, why not simply add a static boolean variable that will get set during FileWriter initialization?