Finding @Ignore'd tests that are now passing - java

For a decent sized open source project where developers come and go, someone may fix a bug without realizing that someone else a while back committed a disabled unit test (à la @Ignore). We'd like to find the passing tests that are disabled so we can enable them and update the bug tracker, CC list, and anything else downstream.
What is the best way to occasionally run all @Ignore'd tests and identify the ones that are now passing? We are using Java 1.6 with JUnit 4, building our project with Ant and transitioning to Gradle. We use Jenkins for CI.
A few ideas:
Permanently replace all of our @Ignore annotations with a conditional ignore (a sketch follows this list)
http://www.codeaffine.com/2013/11/18/a-junit-rule-to-conditionally-ignore-tests/
Run a custom JUnit4 class runner that changes the behavior of @Ignore.
https://stackoverflow.com/a/42520871
Temporarily comment out all @Ignore annotations so that they run. However, we'd need a way to negate the failures.
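One possible shape for the first idea, loosely following the linked codeaffine approach: a JUnit 4 rule that skips tests marked with a custom annotation unless a system property is set, so a dedicated Jenkins job can flip the switch. The @ConditionallyIgnored annotation name and the runIgnored property are made up for this sketch, and it assumes a JUnit 4 version where Description.getAnnotation and Assume are available:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.junit.Assume;
import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

// Hypothetical marker used instead of @Ignore.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ConditionallyIgnored {
}

public class ConditionalIgnoreRule implements TestRule {

    public Statement apply(final Statement base, final Description description) {
        return new Statement() {
            public void evaluate() throws Throwable {
                boolean runIgnored = Boolean.getBoolean("runIgnored"); // -DrunIgnored=true on the CI job
                boolean marked = description.getAnnotation(ConditionallyIgnored.class) != null;
                // Skipped (assumption violated), not failed, when marked and the switch is off.
                Assume.assumeTrue(!marked || runIgnored);
                base.evaluate();
            }
        };
    }
}

Each test class would declare the rule with @Rule and use @ConditionallyIgnored where it used @Ignore before; a nightly "run the ignored ones" job passes -DrunIgnored=true and reports which of them now pass.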

Sorry, this is not a solution, but rather another alternative that has worked for me:
My key point was to not modify the existing (1000s of) unit tests. So no broad code changes. No new annotations, certainly not temporary ones.
What I did was override JUnit's @Ignore detection and make it conditional, via classpath prepends: check in a separate control file whether that test/class is listed as enabled or disabled. This is based on package/FQCN/method name and regexp patterns. If covered, the test runs even though it still has @Ignore in the unchanged original JUnit test source.
Log the outcome, amend the control file. Rinse and repeat.
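The approach described above shadows JUnit's own @Ignore handling via classpath prepends, so the sketch below is only a rough approximation of the same control-file idea using a public extension point instead: a custom runner (requires JUnit 4.12 for the protected isIgnored hook, and unlike the real approach it needs @RunWith on the test classes). The control file name reenabled-tests.txt and the "fully.qualified.ClassName#method" pattern format are assumptions:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

import org.junit.runners.BlockJUnit4ClassRunner;
import org.junit.runners.model.FrameworkMethod;
import org.junit.runners.model.InitializationError;

public class ControlledIgnoreRunner extends BlockJUnit4ClassRunner {

    private final List<Pattern> reenabled = new ArrayList<Pattern>();

    public ControlledIgnoreRunner(Class<?> klass) throws InitializationError {
        super(klass);
        loadControlFile();
    }

    private void loadControlFile() {
        InputStream in = getClass().getResourceAsStream("/reenabled-tests.txt");
        if (in == null) {
            return; // no control file on the classpath: behave like the stock runner
        }
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.length() > 0 && !line.startsWith("#")) {
                    reenabled.add(Pattern.compile(line)); // one regexp per line
                }
            }
            reader.close();
        } catch (IOException e) {
            throw new RuntimeException("Could not read control file", e);
        }
    }

    protected boolean isIgnored(FrameworkMethod child) {
        String name = getTestClass().getJavaClass().getName() + "#" + child.getName();
        for (Pattern pattern : reenabled) {
            if (pattern.matcher(name).matches()) {
                return false; // listed in the control file: run it despite @Ignore
            }
        }
        return super.isIgnored(child); // everything else keeps its @Ignore behaviour
    }
}

A periodic Jenkins job can run the suite with this runner, log which re-enabled tests now pass, and the control file gets amended accordingly (the rinse-and-repeat loop above).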

Related

Maven-surefire-plugin tests fail in Jenkins build but run successfully locally?

I have a maven project with test execution by the maven-surefire-plugin. An odd phenomenon I've observed and been dealing with is that running locally
mvn clean install
which executes my tests, results in a successful build with 0 Failures and 0 Errors.
Now when I deploy this application to our remote repo that Jenkins attempts to build, I get all sorts of random EasyMock errors, typically of the sort:
java.lang.IllegalStateException: 3 matchers expected, 4 recorded. at org.easymock.internal.ExpectedInvocation.createMissingMatchers
This is a legacy application we have inherited, and we are aware that many of these tests are flawed, if not plainly using EasyMock incorrectly, but the situation is that test execution gives me a successful build locally and a failing one in Jenkins.
I know that the order of execution of these tests is not guaranteed, but I am wondering how I can introspect what is different in the Jenkins build pipeline vs. local to help identify the issue?
Is there anything I can do to force execution of the tests the way they're run locally? At this point, I have simply excluded many troublesome test classes, but no matter how many times I see a Jenkins failure and either fix the problem or exclude the test class, it only goes on to complain about some other test class it didn't mention before.
Any ideas how to approach a situation like this?
I have experienced quite a similar situation, and the cause of mine was obviously some concurrency problem with the test implementations.
And, after reading your comment:
What I actually did that fixed it (like magic am I right?) is for the maven-surefire plugin, I set the property reuseForks=false, and forkCount=1C, which is just 1*(number of CPU's of machine).
... I am even more convinced that you have concurrency problems with your tests. Concurrency is not easy to diagnose, especially when your experiment runs OK on one CPU. But race conditions might arise when you run it on another system (which is usually faster or slower).
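For what it's worth, here is a hedged illustration (invented interface and test names) of how that particular EasyMock message tends to arise: mixing argument matchers with raw values while recording an expectation throws an IllegalStateException from ExpectedInvocation.createMissingMatchers, and matchers left over from a test that aborted mid-recording can inflate the "recorded" count for whichever test runs next on the same thread, which is one way the failures end up order- and environment-dependent:

import static org.easymock.EasyMock.createMock;
import static org.easymock.EasyMock.eq;
import static org.easymock.EasyMock.expect;

import org.junit.Test;

public class MatcherMixupTest {

    interface AccountService {
        boolean transfer(String from, String to, int amount);
    }

    // Two matchers recorded (eq, eq) but three arguments in the call, so
    // EasyMock fails with "3 matchers expected, 2 recorded." while recording.
    // Leftover matchers from an earlier, aborted recording on the same thread
    // shift those numbers the other way (e.g. "3 matchers expected, 4 recorded.").
    @Test(expected = IllegalStateException.class)
    public void mixingMatchersAndRawValues() {
        AccountService service = createMock(AccountService.class);
        expect(service.transfer(eq("a"), eq("b"), 100)).andReturn(Boolean.TRUE);
    }
}

Either use matchers for every argument or for none, and make sure tests that blow up while recording don't leave state behind for the next test sharing the same thread.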
I strongly recommend that you review your tests one by one and ensure that each of them is logically isolated:
They should not rely upon an expected previous state (files, database, etc.). Instead, they should prepare the proper setup before each execution (a sketch follows the example below).
If they concurrently modify a common resource that might interfere with another test's execution (files, database, singletons, etc.), every assert must be done with as much synchronization as needed, taking into account that the initial state is unknown:
Wrong test:
MySingleton.getInstance().put(myObject);
assertEquals(1, MySingleton.getInstance().size());
Right test:
synchronized (MySingleton.getInstance()) {
    MySingleton.getInstance().put(myObject);
    assertTrue(MySingleton.getInstance().contains(myObject));
}
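To make the "prepare the proper setup before each execution" point concrete, here is a minimal sketch assuming the MySingleton class from the example exposes some way to wipe its state (the clear() method is an assumption):

import static org.junit.Assert.assertTrue;

import org.junit.Before;
import org.junit.Test;

public class MySingletonTest {

    private Object myObject;

    @Before
    public void resetSharedState() {
        // Start every test from a known, empty state instead of whatever
        // a previously executed test happened to leave behind.
        MySingleton.getInstance().clear();
        myObject = new Object();
    }

    @Test
    public void putStoresTheObject() {
        synchronized (MySingleton.getInstance()) {
            MySingleton.getInstance().put(myObject);
            assertTrue(MySingleton.getInstance().contains(myObject));
        }
    }
}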
A good starting point for the review is to take one of the failing tests and trace the execution backwards to find the root cause of the failure.
Explicitly setting the tests' execution order is not good practice, and I wouldn't recommend it even if I knew it was possible, because it would only hide the actual cause of the problem. Keep in mind that in a real production environment the execution order is not usually guaranteed.
JUnit test run order is non-deterministic.
Are the versions of Java and Maven the same on the 2 machines? If yes, make sure you're using the most recent maven-surefire-plugin version. Also, make sure to use a Freestyle Jenkins job with a Maven build step instead of the Maven project type. Using the proper Jenkins build type can either fix build problems outright or give you a better error so you can diagnose the actual issue.
You can turn on Maven debug logging to see the order the tests are being run in. Each test should set up (and perhaps tear down) its own test data to make sure the tests can run independently. Perhaps seeing the test order will give you some clues as to which classes depend on others inappropriately. And if the app uses caching, ensure the cache is cleaned out between tests (or explicitly populated, depending on what the test needs to do). Also consider running the tests one package at a time to isolate the culprits; multiple surefire plugin executions might be useful.
Also check the app for classpath problems. This answer has some suggestions for cleaning the classpath.
And another possibility: Switching to a later version of JUnit might help - unless the app is using Spring 2.5.6.x. If the app is using Spring 2.5.6.x and cannot upgrade, the highest possible version of JUnit 4.x that may be used is 4.4. Later versions of JUnit are not compatible with Spring Test 2.5.6 and may lead to hard-to-diagnose test errors.

JUnit report to show test functionality, not coverage

One of the problems of being a team lead is that people on the team (sometimes even including myself) often create JUnit tests without any testing functionality.
It's easily done since the developers use their JUnit test as a harness to launch the part of the application they are coding, and then either deliberately or forgetfully just check it in without any assert tests or mock verifies.
Then later it gets forgotten that the tests are incomplete, yet they pass and produce great code coverage. Running up the application and feeding data through it will create high code coverage stats from Cobertura or Jacoco and yet nothing is tested except its ability to run without blowing up - and I've even seen that worked-around with big try-catch blocks in the test.
Is there a reporting tool out there which will test the tests, so that I don't need to review the test code so often?
I was temporarily excited to find Jester which tests the tests by changing the code under test (e.g. an if clause) and re-running it to see if it breaks the test.
However this isn't something you could set up to run on a CI server - it requires set-up on the command line, can't run without showing its GUI, only prints results onto the GUI and also takes ages to run.
PIT is the standard Java mutation tester. From their site:
Mutation testing is conceptually quite simple.
Faults (or mutations) are automatically seeded into your code, then your tests are run. If your tests fail then the mutation is killed, if your tests pass then the mutation lived.
...
Traditional test coverage (i.e. line, statement, branch, etc.) measures only which code is executed by your tests. It does not check that your tests are actually able to detect faults in the executed code. It is therefore only able to identify code that is definitely not tested.
The most extreme example of the problem are tests with no assertions. Fortunately these are uncommon in most code bases. Much more common is code that is only partially tested by its suite. A suite that only partially tests code can still execute all its branches (examples).
As it is actually able to detect whether each statement is meaningfully tested, mutation testing is the gold standard against which all other types of coverage are measured.
The quality of your tests can be gauged from the percentage of mutations killed.
It has a corresponding Maven plugin to make it simple to integrate as part of a CI build. I believe the next version will also include proper integration with Maven site reports too.
Additionally, the creator/maintainer is pretty active here on StackOverflow, and is good about responding to tagged questions.
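To make the quoted idea concrete, here is a tiny made-up example (class and values invented) of the difference between coverage and mutation detection:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class DiscountTest {

    static class Discount {
        // A mutation tool might flip ">" to ">=" or change the constants here.
        static int apply(int price) {
            return price > 100 ? price - 10 : price;
        }
    }

    @Test
    public void executesButDetectsNothing() {
        Discount.apply(150); // 100% line coverage, yet every mutant of apply() survives
    }

    @Test
    public void killsTheBoundaryMutant() {
        assertEquals(100, Discount.apply(100)); // fails if ">" is mutated to ">="
        assertEquals(140, Discount.apply(150)); // fails if the discount amount is mutated
    }
}

A coverage tool scores both tests the same; a mutation tool reports that only the second one actually kills anything.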
As far as possible, write each test before implementing the feature or fixing the bug the test is supposed to deal with. The sequence for a feature or bug fix becomes:
Write a test.
Run it. At this point it will fail if it is a good test. If it does not fail, change, replace, or add to it.
When you have a failing test, implement the feature it is supposed to test. Now it should pass.
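A minimal illustration of that red/green sequence, with an invented Slug class:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class SlugTest {

    @Test
    public void replacesSpacesWithDashes() {
        // Written first: it fails (or does not even compile) until Slug exists.
        assertEquals("hello-world", Slug.slugify("Hello World"));
    }
}

// The simplest implementation that turns the failing test green.
class Slug {
    static String slugify(String input) {
        return input.trim().toLowerCase().replaceAll("\\s+", "-");
    }
}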
You have various options:
You probably could use a code analysis tool like Checkstyle to verify that each test has an assertion. Or alternatively use a JUnit rule to verify this, but both are easily tricked and work only on a superficial level.
Mutation testing as Jester does it is again a technical solution which would work, and it seems @Tom_G has a tool that might work. But these tools are (in my experience) extremely slow, because they work by changing the code, running the tests, and analyzing the results over and over again. So even tiny code bases take lots of time, and I wouldn't even think about using it in a real project.
Code Reviews: such bad tests are easily caught by code reviews, and they should be part of every development process anyway.
All this still only scratches the surface. The big question you should ponder is: why do developers feel tempted to create code just to start a certain part of the application? Why don't they write tests for what they want to implement, so there is almost no need to start parts of the application? Get some training in automated unit testing and especially TDD/BDD, i.e. a process where you write the tests first.
In my experience it is very likely that you will hear things like: we can't test this because .... You need to find the real reason why the developers can't or don't want to write these tests, which might or might not be the reasons they state. Then fix those reasons and those abominations of tests will go away all on their own.
What you are looking for is indeed mutation testing.
Regarding tool support, you might also want to look at the Major mutation framework (mutation-testing.org), which is quite efficient and configurable. Major uses a compiler-integrated mutator and gives you great control over what should be mutated and tested. As far as I know, Major does not yet produce graphical reports but rather data (CSV) files that you can process or visualize in any way you want.
It sounds like you need to consider a coverage tool like JaCoCo; the Gradle plugin provides a report on coverage. I also use the EclEmma Eclipse plugin to obtain the same results, but with a fairly nice integration in the IDE.
In my experience, JaCoCo has provided acceptable numbers even when there are no-op unit tests, as it seems able to accurately determine the tested code paths. No-op tests get low or 0% coverage scores, and the scores increase as the tests become more complete.
Update
To address the down-voter: perhaps a more appropriate tool to address this is PMD. It can be used in an IDE or a build system. With proper configuration and rule development it could be used to find these incomplete unit tests. I have used it in the past to find methods missing certain security-related annotations.

Ensure minimal coverage on new Subversion commits

We have a massive project with almost no unit tests at all. I would like to ensure from now on that the developers cannot commit new features (or bug fixes!) without minimal coverage from corresponding unit tests.
What are some ways to enforce this?
We use many tools, so perhaps I can use a plugin (jira, greenhopper, fisheye, sonar, hudson). I was also thinking perhaps a Subversion pre-commit hook, the Commit Acceptance Plugin for jira, or something equivalent.
Thoughts?
Sonar (a wonderful tool, by the way) with the Build Breaker plugin can break your Hudson build when some metrics don't meet specified rules. You can set up a rule in Sonar that triggers an alert (eventually causing the build to fail) when coverage falls below a given threshold. The only drawback is that you probably want the coverage to grow, so you must remember to increase the alert level every day to the current value.
What you want to do is determine what is new code, and verify that the new code is covered by some test.
Determining code coverage in general can be accomplished with any of a variety of test coverage tools. Many test coverage tools can simply reinstrument your entire application and then you can run tests to determine coverage.
Our (Semantic Designs') line of test coverage tools can determine, from a changed-file list, just the individual files that need to be re-instrumented and, with careful test organization, just the tests that need to be re-executed. This will minimize the cost of re-running your tests, and you'll still end up with the same overall coverage data. (Actually, these tools detect which tests need to be re-run based on changes at the method level.)
Once you have test coverage data, what you want to know is whether specifically the new code is covered by some tests. You can do this sloppily with just test coverage data if you know which files changed, by insisting that the changed files have 100% coverage. That probably doesn't work in practice.
You could instead take advantage of SD's Smart Differencer tools to give a more precise answer. These tools compare two language files, and indicate where the changes are using the language syntax (e.g., expression, statement, declaration, method body, not just changed source lines) and conceptual editing operations (move, copy, delete, insert, rename-identifier-within-block). SmartDifferencer deltas tend to be both smaller and finer than what you would get from a plain diff tool.
It is easy to extract from the SmartDifferencer's output a list of changed lines. One could compute the intersection of that, per file, with the lines covered by the test coverage data. If the changed lines are not entirely within the set of covered lines, then "new" code hasn't been tested and you can raise a flag, stop a check-in, or do whatever else signals that your check-in policy has been violated.
The TestCoverage and SmartDifferencer tools don't come out-of-the-box with this computation done for you, but it should be a pretty easy script to implement.
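A hedged sketch of such a script, assuming you have already extracted changed line numbers per file from the differencer and covered line numbers per file from the coverage report (the data shapes and names below are invented):

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class NewCodeCoverageCheck {

    // Returns file -> changed-but-uncovered line numbers; empty means the policy holds.
    static Map<String, Set<Integer>> uncoveredChanges(
            Map<String, Set<Integer>> changedLines,
            Map<String, Set<Integer>> coveredLines) {
        Map<String, Set<Integer>> violations = new TreeMap<String, Set<Integer>>();
        for (Map.Entry<String, Set<Integer>> entry : changedLines.entrySet()) {
            Set<Integer> uncovered = new TreeSet<Integer>(entry.getValue());
            Set<Integer> covered = coveredLines.get(entry.getKey());
            if (covered != null) {
                uncovered.removeAll(covered);
            }
            if (!uncovered.isEmpty()) {
                violations.put(entry.getKey(), uncovered);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> changed = new HashMap<String, Set<Integer>>();
        changed.put("Foo.java", new TreeSet<Integer>(Arrays.asList(10, 11, 42)));
        Map<String, Set<Integer>> covered = new HashMap<String, Set<Integer>>();
        covered.put("Foo.java", new TreeSet<Integer>(Arrays.asList(10, 11)));
        // Prints {Foo.java=[42]}: line 42 changed but no test executed it, so a
        // pre-commit hook or CI step could reject the commit here.
        System.out.println(uncoveredChanges(changed, covered));
    }
}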
If you use Maven, the Cobertura plugin can be a good choice (and not as annoying for developers as an SVN hook):
http://mojo.codehaus.org/cobertura-maven-plugin/usage.html

Delete or comment out non-working JUnit tests?

I'm currently building a CI build script for a legacy application. There are sporadic JUnit tests available and I will be integrating a JUnit execution of all tests into the CI build. However, I'm wondering what to do with the 100'ish failures I'm encountering in the non-maintained JUnit tests. Do I:
1) Comment them out as they appear to have reasonable, if unmaintained, business logic in them in the hopes that someone eventually uncomments them and fixes them
2) Delete them, as it's unlikely that anyone will fix them and the commented-out code will only be ignored or be clutter for evermore
3) Track down those who have left this mess in my hands and whack them over the heads with printouts of the code (which, due to the long-method smell, will be sufficiently suited to the task) while preaching the benefits of a well maintained and unit tested code base
If you use JUnit 4 you can annotate those tests with the @Ignore annotation.
If you use JUnit 3 you can just rename the tests so their names don't start with "test".
Also, try to fix the tests for any functionality you are modifying, in order not to make the code mess any larger.
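A small sketch of the JUnit 4 option (the class name, reason text and ticket reference are invented):

import org.junit.Ignore;
import org.junit.Test;

public class LegacyCalculationTest {

    // JUnit 4: skipped, but still visible as "ignored" in every CI report.
    @Ignore("Broken since the tax-rule change; see the corresponding ticket")
    @Test
    public void calculatesLegacyTaxRate() {
        // ...
    }

    // JUnit 3 equivalent: if the class extended TestCase, rename
    // testCalculatesLegacyTaxRate() to e.g. brokenCalculatesLegacyTaxRate()
    // so the runner no longer picks it up.
}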
Follow the no broken window principle and take some action towards a solution of the problem. If you can't fix the tests, at least:
Ignore them from the unit tests (there are different ways to do this).
Enter as many issue as necessary and assign people to fix the tests.
Then, to prevent such a situation from happening in the future, install a plug-in similar to the Hudson Game Plugin. People get assigned points during continuous integration, e.g.
-10 break the build <-- the worst
-1 break a test
+1 fix a test
etc.
Really cool tool to create a sense of responsibility about unit tests within a team.
The failing JUnit tests indicate that either
The source code under test has been worked on without the tests being maintained. In this case option 3 is definitely worth considering, or
You have a genuine failure.
Either way you need to fix/review the tests/source. Since it sounds like your job is to create the CI system and not to fix the tests, in your position I would leave a time-bomb in the tests. You can get very fancy with annotated methods in JUnit 4 (something like @IgnoreUntil(date="2010/09/16")) and a custom runner, or you can simply add an if statement to the first line of each test:
if (isBeforeTimeBomb()) {
    return;
}
Where isBeforeTimeBomb() can simply check the current date against a future date of your choosing. Then you follow the advice given by others here and notify your development team that the build is green now, but is likely to explode in X days unless the timebombed tests are fixed.
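One possible isBeforeTimeBomb() helper for the snippet above, using the 2010/09/16 date from the @IgnoreUntil example (plain java.util.Calendar, nothing beyond the JDK):

import java.util.Calendar;
import java.util.GregorianCalendar;

final class TimeBomb {

    // Time-bombed tests silently pass until this date, then start failing for real.
    private static final Calendar DEADLINE = new GregorianCalendar(2010, Calendar.SEPTEMBER, 16);

    static boolean isBeforeTimeBomb() {
        return Calendar.getInstance().before(DEADLINE);
    }
}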
Comment them out so that they can be fixed later.
Generate test coverage reports (with Cobertura for example). The methods that were supposed to be covered by the tests that you commented out will then be indicated as not covered by tests.
If they compile but fail: leave them in. That will get you a good history of test improvements over time when using CI. If the tests do not even compile and thus break the build, comment them out and poke the developers to fix them.
This obviously does not preclude using option 3 (hitting them over the head), you should do that anyway, regardless of what you do with the tests.
You should definitely disable them in some way for now. Whether that's done by commenting, deleting (assuming you can get them back from source control) or some other means is up to you. You do not want these failing tests to be an obstacle for people trying to submit new changes.
If there are few enough that you feel you can fix them yourself, great -- do it. If there are too many of them, then I'd be inclined to use a "crowdsourcing" approach. File a bug for each failing test. Try to assign these bugs to the actual owners/authors of the tests/tested code if possible, but if that's too hard to determine then randomly selecting is fine as long as you tell people to reassign the bugs that were mis-assigned to them. Then encourage people to fix these bugs either by giving them a deadline or by periodically notifying everyone of the progress and encouraging them to fix all of the bugs.
A CI system that is steady red is pretty worthless. The main benefit is to maintain a quality bar, and that's made much more difficult if there's no transition to mark a quality drop.
So the immediate effort should be to disable the failing tests, and create a tracking ticket/work item for each. Each of those is resolved however you do triage - if nobody cares about the test, get rid of it. If the failure represents a problem that needs to be addressed before ship, then leave the test disabled.
Once you are in this state, you can now rely on the CI system to tell you that urgent action is required - roll back the last change, or immediately put a team on fixing the problem, or whatever.
I don't know your position in the company, but if it's possible leave them in and file the problems as errors in your ticket system. Leave it up to the developers to either fix them or remove the tests.
If that doesn't work remove them (you have version control, right?) and close the ticket with a comment like 'removed failing junit tests which apparently won't be fixed' or something a bit more polite.
The point is, JUnit tests are application code and as such should work. That's what developers get paid for. If a test isn't appropriate anymore (it tests something that doesn't exist anymore), developers should signal that and remove the test.

Run JUnit automatically when building Eclipse project

I want to run my unit tests automatically when I save my Eclipse project. The project is built automatically whenever I save a file, so I think this should be possible in some way.
How do I do it? Is the only option really to get an ant script and change the project build to use the ant script with targets build and compile?
Update: I will try 2 different approaches now:
Running an additional builder for my project that executes the ant target test (I have an ant script anyway)
ct-eclipse, recommended by Thorbjørn
For sure it is unwise to run all tests, because we can have, for example, 20,000 tests whereas our change could affect only, let's say, 50 of them, among which are tests for the class we have changed and tests for classes that collaborate with our class.
There is a useful plugin called Infinitest (http://improvingworks.com/products/infinitest/) which runs only some tests (related to the class we've changed) just after we save changes. It also integrates quite nicely with the editor (using annotations) and the problem view, displaying non-passing tests as errors.
Right click on your project > Properties > Builders > New, and there add your Ant builder.
But, in my opinion, it is unwise to run the unit tests on each save.
See if Eclipse has a plugin for Infinitest.
I'd also consider TestNG as an alternative to JUnit. It has a lot of features that might be helpful in partitioning your unit test classes into shorter and longer running groups.
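As a rough illustration of how TestNG groups could split the suite (group names and test bodies invented), the fast group could run on every save while the slow group stays on the CI server:

import static org.testng.Assert.assertEquals;

import org.testng.annotations.Test;

public class PartitionedTest {

    @Test(groups = "fast")
    public void quickArithmeticCheck() {
        assertEquals(2 + 2, 4);
    }

    @Test(groups = "slow")
    public void longRunningIntegrationCheck() throws InterruptedException {
        Thread.sleep(2000); // stands in for an expensive end-to-end scenario
        assertEquals("ok", "ok");
    }
}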
I believe you are looking for http://ct-eclipse.tigris.org/
I've experimented with the concept earlier, and my personal conclusion was that in order for this to be useful you need a lot of tests which take time. Personally I save very frequently so this would happen frequently, and I didn't find it to be an advantage. It might be different for you.
Instead we bit the bullet and set up a "build server" which watches our CVS repository and builds projects as they change. If the compilation fails or the tests fail we are notified quickly so we can remedy it.
It is as always a matter of taste what works for you. This is what I've found.
I would recommend Infinitest for the described situation. Infinitest is nowadays a GPL v3 licensed product. Eclipse update site: http://infinitest.github.com
Then you must use INFINITEST. INFINITEST helps you to do Continuous Testing.
Whenever you make a change, Infinitest runs tests for you.
It selects tests intelligently, and only runs the ones you need. It reports unit test failures like compiler errors, and provides additional information that helps you write better tests.
