I have a Tomcat-powered webapp that builds to a war and is deployed. It's been used for a few somewhat different tasks over the years, and it has lots and lots and lots of classes and libraries.
I'd like to do some sort of automated census of used and unused classes (and maybe even dependencies) and get a report showing which classes, methods, or even lines have not been executed over a few days of production use.
Is there a tool that could generate such a report for me?
You're looking for a code coverage tool.
For Java, try EMMA:
http://emma.sourceforge.net/
If you are talking about statistics on code that goes unused (functionally) in a production system, you can start by simply enabling "-verbose:class" as a startup parameter. I don't think the Sun JDK (at least JDK 5) supports a regular expression to restrict the log to specific package(s), so you will have to filter the output yourself.
It's better to find unused methods/blocks with static analysis tools like PMD/Sonar rather than instrumenting down to the method/line level.
Related
Can anyone confirm that Checkstyle is meant to be run with the compiled versions of classes on the classpath?
We currently run it on the Java files alone but recently we've been encountering some errors around the "RedundantThrows" and "JavadocMethod" checks. The error is "Unable to find class information for X". Searching online we've found that the solution is to add the compiled classes to the classpath before running Checkstyle.
Our problem is that our Checkstyle audit currently runs on a server that only has access to the source and we just want to confirm that Checkstyle will in fact need access to compiled classes. Can't seem to find "definitive proof" on the official site.
Checkstyle is perfectly happy with the source files only. Compiled versions of your classes are not required.
However, it is still better to have compiled classes available, because a few individual checks do make use of compiled .class files. These checks mention the fact that they need binaries in their documentation. One is the JavadocMethod check you mention. This one will still function without binaries, but you may see some irritation in the logs.
The other check I can think of needing compiled classes is RedundantThrows. This one will probably not do much good with only sources. You'd have to give it a try.
In both cases, you can suppress the load errors by setting the suppressLoadErrors property to true. Without binaries, the check will not be able to gather inheritance information. So some features of the check will be limited, but it will otherwise work fine or at least not bother you.
In my case there are two reasons for doing that:
Sometimes people mistakenly import classes that are present in the JDKs on their MacBooks but absent on Linux. That causes the build to fail on the CI servers, which are Linux-based boxes. It doesn't happen frequently, but when it does I think there should be some smarter way to find out earlier.
Unused imports trigger warnings in the IDE/code analysis. From time to time somebody needs to spend time cleaning this stuff up. Even if it's just a single right click in the IDE, you still need to commit your changes and make sure everything is all right on the build.
I'm curious if there is any way to find unused imports programmatically (let's say from a unit test) and fail locally if there are any.
Maybe failing a build because of an unused import sounds harsh, but if it saves time for the team overall it makes sense to do so (I would love to hear opinions on that as well).
UPDATE:
I followed yegor256's suggestion and incorporated a Checkstyle task with an initially small subset of the Sun Code Conventions (unused imports is one of them) and made it break the build if violations are found.
After a one-week trial we've got zero unused imports in our codebase and, surprisingly, zero complaints about this rule (by the way, Checkstyle is really fast: analyzing ~100 KLOC takes less than one second).
As for using an IDE for such analysis: yes, it's a good choice, but having this kind of check run as part of an automated build is better.
What you're trying to do is called static code analysis. Checkstyle can help you. If you're using Maven, this plugin will do the automation for you: http://maven.apache.org/plugins/maven-checkstyle-plugin/
You can also take a look at qulice.com (I'm one of its developers), which integrates a few static analysis tools and pre-configures them (incl. Checkstyle, PMD, FindBugs).
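For the "programmatically, let's say from a unit test" part of the question, here is a rough sketch of the idea in plain Java. It is a naive text-based approximation, not how Checkstyle's own unused-imports check works, and the class and method names (UnusedImportSketch, findUnusedImports) are made up for illustration. It skips static and wildcard imports and treats a mention inside a comment or string literal as a use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Naive unused-import finder; illustrative only. */
final class UnusedImportSketch {

    // Matches single-type imports such as "import java.util.List;".
    // Static and wildcard imports do not match and are therefore ignored.
    private static final Pattern IMPORT =
            Pattern.compile("^\\s*import\\s+([\\w.]+)\\s*;", Pattern.MULTILINE);

    /** Returns the fully qualified names of imports whose simple name never appears in the code. */
    static List<String> findUnusedImports(String source) {
        List<String> unused = new ArrayList<>();
        // Strip the import statements so a name is not counted as used by its own import.
        String body = IMPORT.matcher(source).replaceAll("");
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            String qualifiedName = m.group(1);
            String simpleName = qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
            // Whole-word search for the simple name in the remaining text.
            Pattern use = Pattern.compile("\\b" + Pattern.quote(simpleName) + "\\b");
            if (!use.matcher(body).find()) {
                unused.add(qualifiedName);
            }
        }
        return unused;
    }
}
```

A unit test could read each .java file under the source root, call findUnusedImports on its contents, and fail if any list is non-empty. For anything beyond a quick experiment, a real parser-based tool such as Checkstyle is the safer choice.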
If you are using the Eclipse IDE or IntelliJ IDEA, you can configure them to
1a. organize imports / remove unused imports on save or before commit (see cleanup preferences)
1b. switch the "unused imports" warning to an error (see error settings)
2a. configure a JRE which does not include com.* stuff
2b. configure the warning of proprietary API usage from the JRE to be an error
You might still want to check that on the build server, though. In this case the more complicated stuff like configuring CheckStyle would still be necessary.
I'm curious if there is any way to find unused imports programmatically (let's say from a unit test) and fail the build locally if there are any.
I use IntelliJ to organise imports; this removes all the unused imports. You can do this with one hot key from the top of your code base to correct all the imports. (It also has over 700 other types of static checks and fixes.)
Maybe failing a build because of an unused import sounds harsh, but if it saves time for the team overall it makes sense to do so (I would love to hear opinions on that as well).
I have IntelliJ check in code that is formatted and has its imports organised, so the issue never arises in the first place. ;)
In computer science, the process of analyzing code without executing it is known as static code analysis.
Try using an IDE. I am using Eclipse, which marks all unused imports and unused variables or methods with a yellow underline.
Aren't these unrelated questions? If you import classes only present in the local JDK, these imports are used (just unsatisfied). For either problem, I recommend solving it in the IDE so the problem will be detected when code is written, rather than prior to checkin (the earlier the detection, the easier the fix ...).
In Eclipse, you could prevent unsatisfied imports with access rules, and automatically fix imports whenever a source file is saved by enabling the appropriate save action. If you check these settings into version control, you can easily share them with the team.
I see a lot of comments along the same lines: use this IDE or that IDE. But, my friends, try to understand the difference: doing something programmatically is one thing and using an IDE is another.
If I want a process to be programmatic, then suggesting an IDE is not useful. Someone may be asking this question because they are building a complete process and this step is part of it. How would opening an IDE help on the different machines and operating systems where CI runs?
I too am building a tool along similar lines. I have achieved it up to a point, but it programmatically opens the IDE, fixes the source code, and closes the IDE automatically. Doing the same on Linux might be a problem for me.
Understanding someone's point of view before answering is really very important.
I'm trying to figure out which tool to use for getting code-coverage information for projects that are running in a kind of stabilization environment.
The projects are deployed as a war and run on JBoss. I need server-side coverage while running manual / automated tests that interact with a running server.
Let's assume I cannot change the projects' build and therefore cannot add any kind of instrumentation to their jars as part of the build process. I also don't have access to the code.
I've done some reading on various tools, and they all present techniques involving instrumenting the jars at build time (BTW, doesn't that affect production, or are two kinds of outputs generated?).
One tool though, JaCoCo, mentions an "on-the-fly instrumentation" feature. Can someone explain what that means? Can it help me given my limitations?
I've also heard of code coverage using runtime profiling techniques. Can someone help with that?
Thanks,
Ben
AFAIK "on-the-fly instrumentation" means that the coverage tool hooks into the class-loading mechanism, by using a special ClassLoader or a Java agent, and edits the class bytecode when it's being loaded.
The result should be the same as in "offline-instrumentation" with the JARs.
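One standard way to get such a load-time hook is the java.lang.instrument agent API. The sketch below is not taken from JaCoCo or EMMA; it only shows where a tool would plug in. The class name ClassLoadRecorder and the "com/example/" prefix are assumptions for illustration. A real coverage tool would return rewritten bytecode containing probes, whereas this one just records class names and leaves every class untouched.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Records the names of classes as they are loaded.
 * Kept package-private so the sketch fits in one file; a real agent class
 * would normally be public in its own source file.
 */
final class ClassLoadRecorder implements ClassFileTransformer {

    // Prefix in the JVM's internal form, e.g. "com/example/" (hypothetical).
    private final String internalPrefix;
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    ClassLoadRecorder(String internalPrefix) {
        this.internalPrefix = internalPrefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className uses slashes instead of dots; the null check is defensive.
        if (className != null && className.startsWith(internalPrefix)) {
            seen.add(className.replace('/', '.'));
        }
        // Returning null tells the JVM that the bytecode is unchanged.
        return null;
    }

    Set<String> seenClasses() {
        return seen;
    }

    /** Invoked by the JVM before main() when the jar is passed via -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        ClassLoadRecorder recorder = new ClassLoadRecorder(agentArgs == null ? "" : agentArgs);
        inst.addTransformer(recorder);
        // Dump the recorded names when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> recorder.seenClasses().forEach(System.out::println)));
    }
}
```

To try it, the class would be packaged in a jar whose manifest has a Premain-Class entry naming it, and the server would be started with something like -javaagent:recorder.jar=com/example/ (the jar name is made up). No change to the application's build or jars is needed, which is the point of the on-the-fly approach.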
Also have a look at EMMA, which supports both mechanisms. There's also a plugin for Eclipse.
A possible solution to this problem without actual code instrumentation is to use a native (C) JVM agent. It is possible to attach agents to the JVM. In such an agent you can intercept every method call made in your Java code without changing the bytecode.
At every intercepted method call you then write info about the method call which can be evaluated later for code coverage purposes.
The official JVMTI guide defines how such JVM agents can be written.
You don't need to change the build or even have access to the code to instrument the classes. Just instrument the classes found in the delivered jar, re-jar them and redeploy the application with the instrumented jars.
Cobertura even has an Ant task that does that for you: it takes a war file, instruments the classes inside the jars inside the war, and rebuilds a new war file. See https://github.com/cobertura/cobertura/wiki/Ant-Task-Reference
To answer your question about instrumenting the jars on build: yes, of course, the instrumented classes are not used in production. They're only used for the tests.
We have a huge codebase, and some classes are often used via reflection all over the code. We can remove classes and the compiler stays happy, but some of them are used dynamically via reflection, so I can't locate them other than by searching strings ...
Is there some reflection explorer for Java code?
There is no simple tool to do this. However, you can use code coverage instead. This gives you a report of all the lines of code executed, which can be even more useful for either improving test code or removing dead code.
Reflection is by definition very dynamic, and you have to run the right code to see what it would do, i.e. you have to have reasonable tests. You can add logging to everything reflection does if you can access this code, or perhaps you can instrument these libraries (or change them directly).
I suggest, using appropriately licensed source for your JRE, modifying the reflection classes to log when classes are used by reflection (use a map/WeakHashMap to ignore duplicates). Your modified system classes can replace those in rt.jar with -Xbootclasspath/p: on the command line (on Oracle "Sun" JRE, others will presumably have something similar). Run your program and tests and see what comes up.
(Possibly you might have to hack around issues with class loading order in the system classes.)
I doubt any such utility is readily available, but I could be wrong.
This is quite complex, considering that dynamically loaded classes (via reflection) can themselves load other classes dynamically and that the names of loaded classes may come from variables or some runtime input.
Your codebase probably does neither of these. If this is a one-time effort, searching strings might be a good option. Or you could look for calls to reflection methods.
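As a rough illustration of "look for calls to reflection methods", the sketch below scans source text for Class.forName calls with a string literal argument and collects the class names. The names ForNameScanner and literalForNameTargets are invented for this example. It only catches the literal form; names built at runtime, read from configuration, or loaded through other reflection entry points are missed, which is exactly the limitation described above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Finds class names passed as string literals to Class.forName. */
final class ForNameScanner {

    // Matches Class.forName("x.y.Z" and captures the literal between the quotes.
    private static final Pattern FOR_NAME =
            Pattern.compile("Class\\s*\\.\\s*forName\\s*\\(\\s*\"([^\"]+)\"");

    /** Returns the literal class names in order of first appearance. */
    static Set<String> literalForNameTargets(String source) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = FOR_NAME.matcher(source);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }
}
```

Running this over every source file gives a list of classes that must not be deleted even though nothing references them at compile time. Calls whose argument is a variable still need to be reviewed by hand.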
As the other posters have mentioned, this cannot be done with static analysis due to the dynamic nature of reflection. If you are using Eclipse, you might find this coverage tool useful, and it's very easy to work with. It's called EclEmma.
How would you determine the classes (non Sun JDK classes) loaded / unused by a Java application?
Background:
I have a legacy Java Web Start application that has gone through a lot of code changes and now has a lot of classes, most of which are not used. I would like to reduce the download size of the application by deploying only the classes that will be used instead of jarring all the packages.
I will also use the same process to completely delete these unused classes.
Use java -verbose:class to see what classes are loaded, then use grep (or any other tool) to keep only the lines from your packages.
A small limitation: it will only tell you which classes are really used when they are used, so you must cover all use cases of your application.
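If grep is not at hand, the filtering step can be sketched in Java as well. This assumes the line shape printed by older HotSpot JVMs (roughly Java 5 to 8), "[Loaded com.example.Foo from ...]"; newer JVMs print class loading through a different unified-logging layout that this pattern does not cover. VerboseClassFilter and the "com.example." prefix are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts loaded class names for one package prefix from -verbose:class output. */
final class VerboseClassFilter {

    // Assumed line shape: [Loaded com.example.Foo from file:/path/app.jar]
    private static final Pattern LOADED = Pattern.compile("^\\[Loaded (\\S+) from ");

    /** Returns the sorted, de-duplicated class names that start with packagePrefix. */
    static Set<String> loadedClasses(Iterable<String> logLines, String packagePrefix) {
        Set<String> result = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = LOADED.matcher(line);
            if (m.find() && m.group(1).startsWith(packagePrefix)) {
                result.add(m.group(1));
            }
        }
        return result;
    }
}
```

Subtracting this set from the full list of classes in the deployed jars leaves the candidates for removal, subject to the limitation above: a class only shows up if the scenario that needs it was actually exercised.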
You can use a good IDE for that.
For instance IntelliJ IDEA, which analyzes the source code for dependencies and allows you to safely delete a class/method/attribute that is not being used by any other.
That way you can get rid of all your dead code.