Ensure minimal coverage on new Subversion commits - java

We have a massive project with almost no unit tests at all. I would like to ensure from now on that the developers cannot commit new features (or bugs!) without minimal coverage from corresponding unit tests.
What are some ways to enforce this?
We use many tools, so perhaps I can use a plugin (JIRA, GreenHopper, FishEye, Sonar, Hudson). I was also thinking of a Subversion pre-commit hook, the Commit Acceptance Plugin for JIRA, or something equivalent.
Thoughts?

Sonar (a wonderful tool, by the way) with the Build Breaker plugin can break your Hudson build when some metrics don't meet specified rules. You can set up a rule in Sonar that triggers an alert (eventually causing the build to fail) when coverage falls below a given threshold. The only drawback is that you probably want coverage to grow, so you must remember to keep raising the alert threshold to the current value.

What you want to do is determine what is new code, and verify that the new code is covered by some test.
Determining code coverage in general can be accomplished with any of a variety of test coverage tools. Many test coverage tools can simply reinstrument your entire application and then you can run tests to determine coverage.
Our (Semantic Designs') line of test coverage tools can determine, from a changed-file list, just the individual files that need to be re-instrumented and, with careful test organization, just the tests that need to be re-executed. This minimizes the cost of re-running your tests, and you still end up with the same overall coverage data. (Actually, these tools detect which tests need to be re-run based on changes at the method level.)
Once you have test coverage data, what you want to know is whether the specifically new code is covered by some tests. You can do this sloppily with just the coverage data, if you know which files changed, by insisting that the changed files have 100% coverage. That probably doesn't work in practice.
You could instead take advantage of SD's Smart Differencer tools to give a more precise answer. These tools compare two versions of a source file and report where the changes are in terms of the language syntax (e.g., expression, statement, declaration, method body, not just changed source lines) and conceptual editing operations (move, copy, delete, insert, rename-identifier-within-block). SmartDifferencer deltas tend to be both smaller and finer than what you would get from a plain diff tool.
It is easy to extract from the SmartDifferencer's output a list of changed lines. One could compute the intersection of that, per file, with the lines covered by the test coverage data. If the changed lines are not entirely within the set of covered lines, then "new" code hasn't been tested, and you can raise a flag, stop a check-in, or do whatever else signals that your check-in policy has been violated.
The TestCoverage and SmartDifferencer tools don't come out-of-the-box with this computation done for you, but it should be a pretty easy script to implement.
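For example, a minimal sketch of such a script, assuming (hypothetically) that you can get the differencer and the coverage tool to each emit one "path:line" entry per line into a text file; the file names and format here are made up:

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

// Sketch: flag changed lines that no test covered.
// changed-lines.txt and covered-lines.txt are hypothetical exports from the
// differencing tool and the coverage tool, one "path:lineNumber" entry per line.
public class NewCodeCoverageCheck {
    public static void main(String[] args) throws IOException {
        Set<String> changed = new HashSet<>(Files.readAllLines(Paths.get("changed-lines.txt")));
        Set<String> covered = new HashSet<>(Files.readAllLines(Paths.get("covered-lines.txt")));

        changed.removeAll(covered);                 // what remains is new-but-untested code
        changed.forEach(line -> System.err.println("Uncovered change: " + line));

        System.exit(changed.isEmpty() ? 0 : 1);     // non-zero exit can block the check-in or build
    }
}

A non-zero exit code from a script like this is easy to wire into an SVN pre-commit hook or a CI job.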

If you use Maven, the Cobertura plugin can be a good choice (and not as annoying for developers as an SVN hook):
http://mojo.codehaus.org/cobertura-maven-plugin/usage.html

Related

JUnit report to show test functionality, not coverage

One of the problems of a team lead is that people on the team (sometimes even including myself) often create JUnit tests without any testing functionality.
It's easily done since the developers use their JUnit test as a harness to launch the part of the application they are coding, and then either deliberately or forgetfully just check it in without any assert tests or mock verifies.
Then later it gets forgotten that the tests are incomplete, yet they pass and produce great code coverage. Running up the application and feeding data through it will create high code-coverage stats from Cobertura or JaCoCo, and yet nothing is tested except its ability to run without blowing up - and I've even seen that worked around with big try-catch blocks in the test.
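To make the anti-pattern concrete, here is a made-up example: both tests exercise the production code and earn identical coverage, but only the second one can ever fail.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorTest {

    static class PriceCalculator {                    // stand-in for real production code
        double totalWithVat(double net) { return net * 1.20; }
    }

    @Test
    public void harnessOnly() {
        new PriceCalculator().totalWithVat(100.0);    // runs, passes, asserts nothing
    }

    @Test
    public void realTest() {
        assertEquals(120.0, new PriceCalculator().totalWithVat(100.0), 0.001);
    }
}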
Is there a reporting tool out there which will test the tests, so that I don't need to review the test code so often?
I was temporarily excited to find Jester which tests the tests by changing the code under test (e.g. an if clause) and re-running it to see if it breaks the test.
However this isn't something you could set up to run on a CI server - it requires set-up on the command line, can't run without showing its GUI, only prints results onto the GUI and also takes ages to run.
PIT is the standard Java mutation tester. From their site:
Mutation testing is conceptually quite simple.
Faults (or mutations) are automatically seeded into your code, then your tests are run. If your tests fail then the mutation is killed, if your tests pass then the mutation lived.
...
Traditional test coverage (i.e. line, statement, branch, etc.) measures only which code is executed by your tests. It does not check that your tests are actually able to detect faults in the executed code. It is therefore only able to identify code that is definitely not tested.
The most extreme example of the problem are tests with no assertions. Fortunately these are uncommon in most code bases. Much more common is code that is only partially tested by its suite. A suite that only partially tests code can still execute all its branches (examples).
As it is actually able to detect whether each statement is meaningfully tested, mutation testing is the gold standard against which all other types of coverage are measured.
The quality of your tests can be gauged from the percentage of mutations killed.
It has a corresponding Maven plugin to make it simple to integrate as part of a CI build. I believe the next version will also include proper integration with Maven site reports.
Additionally, the creator/maintainer is pretty active here on StackOverflow, and is good about responding to tagged questions.
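To make the concept concrete, here is a small illustration of my own (not taken from the PIT docs). Suppose the mutation tool flips ">=" to ">" in the production method; a test that merely executes the line lets the mutant survive, while a test with an assertion kills it:

import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Production method:  boolean isAdult(int age) { return age >= 18; }
// A typical mutant:   boolean isAdult(int age) { return age >  18; }
public class MutationExampleTest {

    static boolean isAdult(int age) { return age >= 18; }

    @Test
    public void executesButCannotKillTheMutant() {
        isAdult(18);                 // covers the line, detects nothing
    }

    @Test
    public void killsTheMutant() {
        assertTrue(isAdult(18));     // fails when ">=" is mutated to ">", so the mutant dies
    }
}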
As far as possible, write each test before implementing the feature or fixing the bug the test is supposed to deal with. The sequence for a feature or bug fix becomes:
Write a test.
Run it. At this point it will fail if it is a good test. If it does not fail, change, replace, or add to it.
When you have a failing test, implement the feature it is supposed to test. Now it should pass.
You have various options:
You could probably use a code analysis tool like Checkstyle to verify that each test has an assertion, or alternatively use a JUnit Rule to verify this, but both are easily tricked and work only on a superficial level.
Mutation testing, as Jester does it, is again a technical solution that would work, and it seems #Tom_G has a tool that might work. But these tools are (in my experience) extremely slow, because they work by changing the code, running the tests, and analyzing the results, over and over again. So even tiny code bases take lots of time, and I wouldn't even think about using it in a real project.
Code Reviews: such bad tests are easily caught by code reviews, and they should be part of every development process anyway.
All this still only scratches the surface. The big question you should ponder is: why do developers feel tempted to create code just to start a certain part of the application? Why don't they write tests for what they want to implement, so there is almost no need for starting parts of the application? Get some training for automated unit testing, and especially TDD/BDD, i.e. a process where you write the tests first.
In my experience it is very likely that you will hear things like: We can't test this because .... You need to find the real reason why the developers can't or don't want to write these tests, which might or might not be the reasons they state. Then fix those reasons and those abominations of tests will go away all on their own.
What you are looking for is indeed mutation testing.
Regarding tool support, you might also want to look at the Major mutation framework (mutation-testing.org), which is quite efficient and configurable. Major uses a compiler-integrated mutator and gives you great control over what should be mutated and tested. As far as I know, Major does not yet produce graphical reports, but rather data (CSV) files that you can process or visualize in any way you want.
Sounds like you need to consider a coverage tool like JaCoCo; the Gradle plugin provides a report on coverage. I also use the EclEmma Eclipse plugin to obtain the same results, but with a fairly nice integration in the IDE.
In my experience, JaCoCo has provided acceptable numbers even when there are no-op unit tests, as it seems able to accurately determine the tested code paths. No-op tests get low or 0% coverage scores, and the scores increase as the tests become more complete.
Update
To address the down-voter: perhaps a more appropriate tool for this is PMD. It can be used in an IDE or a build system. With proper configuration and rule development it could be used to find these incomplete unit tests. I have used it in the past to find methods missing certain security-related annotations.

Measure Code Coverage only on New Code

We are looking for a creative way to measure code coverage on new code separately from existing code. We have a large legacy project and want to start getting 90+% coverage on any new functionality. We would like a way to easily view a report that filters out any older code, to make sure the new functionality is meeting our goal. Obviously we are still looking at increasing overall coverage on the project, but we need a non-manual way to give us feedback on new code activity. We have this working for static analysis, since we can look at the dates on the source files. Since Cobertura analyzes the class files, they have new dates and this technique doesn't work.
Any Ideas?
Stack:
Java 1.5
JUnit
Cobertura
Hudson
We had a similar situation: we wanted new code tested but could not test all old code at once. What we did is not exactly what you asked, but may give you an idea.
We have a file called linecoverage.standard, and a file called branchcoverage.standard that live on the build server (and local copies). They have a number inside with the current line and branch coverage limits. If the checked in code is below the standard, it fails the build. If it is at the standard it passes the build. If it is ABOVE the standard, a new standard is written equal to the current coverage.
This means our code coverage will never get worse, and should slowly go up. If new code is 90% covered, the overall coverage will keep creeping up. You could also set a goal like raising the standard by one point each week until it reaches your final goal (90%). Having to add a few tests a week to old code is not a bad idea, if it is spread out over enough time.
Our current coverage is up to 75%ish... pretty good coming from a 0% rate under a year ago.
I did this for a large C++ project by using svn blame combined with the output of gcov. If you zip those two results together you have revision information and coverage information for each line. I actually loaded this all into a database to do queries (e.g. show me all the uncovered lines written by joe since r1234). If you only want an aggregate number you can just avoid counting 'old' uncovered lines in your total.
Have a look at EMMA (emma.sourceforge.net) and the associated Eclipse plugin, EclEmma (if you are using Eclipse).
I think this tool can answer your need by letting you select exactly what to measure for coverage.
IMO the best option is to split the codebase into "new" and "legacy" sections. Then either run test coverage analysis only on the "new" section, or ignore the results for the "old" section.
The two best ways to accomplish this are a) split the codebase into two source trees (two projects with a dependency between), or b) maintain two separate package hierarchies in a single project.
Two separate projects is probably preferable, but it might not be possible if there's a cyclical dependency between the legacy codebase and the new codebase (old code depends on new code and new code depends on old code). If you can manage it, a one-way dependency between old and new code will also make the combined codebase easier to understand.
Once you've got this done, either adjust cobertura so that it's only analyzing the bits you want, or at least just focus on the "new" part of the codebase. One additional tip is that in this scheme, it's best to move bits of code from the "legacy" section to the "new" section as you refactor/add tests to them (if code is frequently moving in the other direction, that's not so good :-).
We did it as below, using the sonar.exclusions property:
We use Sonar to display the code coverage reports (reported by Cobertura).
a) Identify the classes that you don't want coverage report on (Legacy classes)
Use your SCM cmd line client.
e.g.: p4 files //depot/... #2000/01/01,#2013/07/13
git log --until="5 days ago"
Direct this list into a file.
You will need to do some parsing based on the SCM tool you use and your destination file should contain one file name per line.
e.g. if the destination file is excludeFile.list, it should look like this:
abc.java
xyz.java
...
b) Now, when you integrate with Sonar (from a Jenkins job), use the property below:
-Dsonar.exclusions=<filename>
And your final coverage report in Sonar contains only your new classes (added after 2013/07/13 in the above example).
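Note that, depending on your Sonar version, sonar.exclusions expects a comma-separated list of file patterns rather than the name of a list file, so you may need a small step that turns excludeFile.list into such a value. A rough sketch (paths and patterns will need adjusting to your project layout):

import java.io.IOException;
import java.nio.file.*;
import java.util.List;

// Turns the one-file-per-line excludeFile.list from step (a) into the
// comma-separated value expected by -Dsonar.exclusions=...
public class BuildSonarExclusions {
    public static void main(String[] args) throws IOException {
        List<String> legacyFiles = Files.readAllLines(Paths.get("excludeFile.list"));
        System.out.println(String.join(",", legacyFiles));   // e.g. abc.java,xyz.java,...
    }
}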
We call what you are trying to do Test Gap analysis. The idea is to test all (or at least most of) the changes you make to a large software system during development, because that's where the most bugs will be. There's empirical evidence to back up this intuition as well!
Teamscale is a tool that does what you are looking for, and it can handle Cobertura reports. The advantage is that you just measure coverage as you normally do and then upload the reports to Teamscale, which will perform the Test Gap analysis to highlight new/changed but untested code on a method-by-method basis.
Full disclaimer: I work for CQSE, the company that makes Teamscale.
In our scenario, we need to measure new code in our day-to-day process. What we did was install SonarQube locally, where the developers can check their code quality, such as the coverage of new code that Sonar reports, and take action right away.
For global metrics, we implemented SonarQube for only our production code, and we gather all of the quality metrics (such as new-code coverage) from there.

How to fail the build when there is new uncovered code?

Do any code coverage tools for Java allow you to cause the build to fail when new uncovered code gets introduced? I don't want to fail the build based on an arbitrary cutoff like 80% because in a large codebase, the actual coverage percentage rarely fluctuates. Also if coverage falls by 0.1% it's hard to tell which are the new uncovered lines.
EDIT
I'm convinced not to fail the build. The other part of the question still stands. How can I find only the uncovered code that was recently checked in?
If you are using a continuous integration server such as Hudson, you can delegate this requirement to a new job which is dependent on the build (which runs during each commit, say).
Create a script which runs your code coverage profile, and fails based on a metric. Include a wget or cURL retrieval of the previous build's code coverage percent, parsed out, if you want to use an automatic metric.
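A rough sketch of such a gate, assuming (hypothetically) that the current build writes its percentage to target/coverage.txt and the previous build's value is retrievable from a URL on the CI server; both locations are made up and depend on your coverage tool and server setup:

import java.io.IOException;
import java.net.URL;
import java.nio.file.*;
import java.util.Scanner;

// Fails (non-zero exit) when the current coverage percentage is lower than the
// previous successful build's percentage.
public class CoverageGate {
    public static void main(String[] args) throws IOException {
        double current = Double.parseDouble(
                Files.readAllLines(Paths.get("target/coverage.txt")).get(0).trim());

        double previous;
        try (Scanner in = new Scanner(new URL(
                "http://ci.example.com/job/myapp/lastSuccessfulBuild/artifact/coverage.txt").openStream())) {
            previous = Double.parseDouble(in.next().trim());
        }

        if (current < previous) {
            System.err.printf("Coverage dropped: %.2f%% -> %.2f%%%n", previous, current);
            System.exit(1);
        }
    }
}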
The Hudson Cobertura plugin lets you raise build warnings, and the native "changes" view will tell you which new code does not have coverage.

Is there an automated way to make sure that all parts of the code are unit tested?

I have written JUnit tests for my class, and would like it to tell me if there is any part of my code that is not unit tested. Is there a way to do this?
Yes, coverage tools like cobertura or emma.
They create reports that show every line in the source code and whether it was executed or not (and aggregated statistics as well).
Of course, they can only show you if the code was run. There is no way to tell if the unit test contained assertions to confirm that the result was correct.
You need some code coverage tools. See here (http://java-source.net/open-source/code-coverage) for some.
If you look at the first one, I think it does what you need.
Cobertura is a free Java tool that calculates the percentage of code accessed by tests. It can be used to identify which parts of your Java program are lacking test coverage. It is based on jcoverage. Features of Cobertura:
Can be executed from Ant or from the command line.
If you use Eclipse, you can also try EclEmma, which shows you which lines of source were covered by your test. This is sometimes more useful than running a coverage tool like Cobertura because you can run a single test from inside Eclipse and then get immediate feedback on what was covered.
Your headline and your actual question differ. The tools mentioned in the other answers can tell you which parts of the code were not tested (= not executed at all). Making sure that "all parts of the code are unit tested" is a different thing. The coverage tools can tell you whether all lines/instructions have been executed, but they don't guarantee that everything is tested functionally (all combinations of data, all execution paths, etc.). That requires some brain power.
In my opinion, test coverage often gives a wrong feeling of safety. E.g. testing trivial getters increases coverage a lot but is rather useless.
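For example (a made-up class), a test like this bumps the line-coverage numbers but guards against essentially nothing:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CustomerTest {

    static class Customer {                          // trivial data holder
        private final String name;
        Customer(String name) { this.name = name; }
        String getName() { return name; }
    }

    @Test
    public void getterReturnsWhatWasSet() {
        assertEquals("Alice", new Customer("Alice").getName());   // full coverage of Customer, little value
    }
}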
If you are using IntelliJ, then there is a button titled "Run With Coverage".

Can my build stipulate that my code coverage never get worse?

I am using Hudson CI to manage a straight Java web project, using Ant to build.
I would like to mandate that the unit test coverage never be worse than the previous build, thereby making sure any new code is always tested, or at least the coverage is continually improving.
Is there a hudson plugin that works this way?
Edit: I am currently using Emma, but would be willing to switch to another coverage app.
Also, as a clarification, I've seen the thresholds in some Hudson plugins, but that's not exactly what I'm after. For example what I'd like is that if coverage for Build #12 was 46% overall, and someone checked in Build #13 with 45% coverage, the build would break.
The reason I want to do this, is that I have a codebase with low test coverage. We don't have time to go back and retroactively write unit tests, but I'd like to make sure that the coverage keeps getting better.
UPDATE: Dan pointed out an edge case with my plan that will definitely be a problem. I think I need to rethink whether this is even a good idea.
Yes. Which coverage tool are you using?
The Cobertura plugin for Hudson definitely supports this. On the project configuration screen you can specify thresholds.
Alternatively, you can make Ant fail the build (rather than Hudson), by using the cobertura-check task.
EDIT: I'm not sure you can do precisely what you are asking for. Even if you could, it could prove problematic. For example, assume you have an average coverage of 75% but for one class you have coverage of 80%. If you remove that 80% class and all of its tests, you reduce the overall coverage percentage even though none of the other code is any less tested than previously.
This is kind of a hack, but we use it for similar reasons with Findbugs and Checkstyle. You can set up an Ant task to do the following (this can be split out into multiple tasks, but I'm combining them for brevity):
1. Run tests with coverage.
2. Parse the coverage results and get the coverage percentage.
3. Read tmp/lastCoverage.txt from the last build (see step 5a).
4. Compare the current coverage percentage with the percentage read from lastCoverage.txt.
5a. If the percentage DIDN'T decrease, write the new percentage over the contents of tmp/lastCoverage.txt.
5b. If the percentage DID decrease, keep the original file and echo "COVERAGE FAILURE" (with Ant's echo task).
Note that steps 2 through 5 don't necessarily need to be done with native Ant tasks - you could use something like Ant's java task to run a Java program to do this for you.
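For instance, a sketch of that comparison program (it assumes step 2 has already parsed the current percentage into tmp/currentCoverage.txt; both file names simply mirror the steps above):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class CoverageRatchet {
    public static void main(String[] args) throws IOException {
        Path lastFile = Paths.get("tmp/lastCoverage.txt");
        double current = readPercent(Paths.get("tmp/currentCoverage.txt"));
        double previous = Files.exists(lastFile) ? readPercent(lastFile) : 0.0;

        if (current >= previous) {
            // step 5a: ratchet the bar upwards
            Files.write(lastFile, String.valueOf(current).getBytes(StandardCharsets.UTF_8));
        } else {
            // step 5b: the Text Finder plugin looks for this string in the console output
            System.out.println("COVERAGE FAILURE");
        }
    }

    private static double readPercent(Path file) throws IOException {
        return Double.parseDouble(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
    }
}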
Then, configure Hudson:
Under "Source code management", make sure "Use Update" is checked. This will allow your lastCoverage.txt file to be retained between builds. Note that this could be problematic if you really, really need things to be cleaned between builds.
Use the Hudson Text Finder plugin with a regular expression to search for "COVERAGE FAILURE" in the build output (make sure that "Also search console output" is checked for the plugin). The text finder plugin can mark the build unstable.
You can obviously replace things like the file name/path and console output to whatever fits within the context of your build.
As I mentioned above, this is rather hacky, but it's probably one of the few (only?) ways to get Hudson to compare things in the previous build to the current build.
Another approach would be to use the Sonar plugin for Hudson to maintain trending of coverage over time and make it easier to assimilate and analyze results. It will also show coverage in the context of other measures, such as Checkstyle and PMD.
Atlassian's Clover supports what you want. Have a look at the clover-check Ant task, specifically the historyDir attribute.
