I have a set of legacy unit tests, most of which are Spring AbstractTransactionalJUnit4SpringContextTests tests, but some manage transactions on their own. Unfortunately, this seems to have introduced side-effects causing completely unrelated tests to fail when modifying the test data set, i.e., the failing test works when running it on its own (with the same initial data set), but fails when being run as part of the complete set of tests.
The tests are typically run through Maven's surefire plugin during the regular Maven build.
What I am looking for is an automated way to vary the number and order of the executed tests to figure out the culprit. A naive but pretty expensive approach would take the power set of all tests and run all possible combinations. A more optimized approach would use the existing test execution order (which is mostly random, but stable) and test all potential ordered subsets. I am aware that the runtime of this process may be lengthy.
Are there any tools / Maven plugins that can do this out of the box?
I don't know of a tool which does specifically what you want, but you could play about with the runOrder parameter in maven surefire. From that page:
Defines the order the tests will be run in. Supported values are
"alphabetical", "reversealphabetical", "random", "hourly"
(alphabetical on even hours, reverse alphabetical on odd hours),
"failedfirst", "balanced" and "filesystem".
Odd/Even for hourly is
determined at the time of scanning the classpath, meaning it could
change during a multi-module build.
So you could use a simple alphabetical runOrder, take the first failure, and start from there. At least you then have a predictable run order. Then, one by one, run each test that precedes the failing one together with the failing test (using -Dincludes) to detect which one is making the failing test fail.
Then repeat the entire process for all of the failing tests. You could run this in a loop overnight or something.
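The pairing step above can be sketched in a few lines of Java. This is only an illustration: CulpritCandidates, pairsBefore and the test class names are hypothetical, and the sketch merely lists which two-test runs to try. How each pair is handed to Maven (the answer suggests -Dincludes) is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: enumerate the two-test runs described above (hypothetical helper, not a Surefire API). */
final class CulpritCandidates {

    /**
     * For a known, stable run order, pairs every test that runs before the
     * failing one with the failing test itself.
     */
    static List<List<String>> pairsBefore(List<String> runOrder, String failingTest) {
        int failingIndex = runOrder.indexOf(failingTest);
        if (failingIndex < 0) {
            throw new IllegalArgumentException("Unknown test: " + failingTest);
        }
        List<List<String>> pairs = new ArrayList<>();
        for (String earlier : runOrder.subList(0, failingIndex)) {
            pairs.add(List.of(earlier, failingTest));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical test class names, already in alphabetical run order.
        List<String> order = List.of("AccountTest", "BillingTest", "CustomerTest", "OrderTest");
        for (List<String> pair : pairsBefore(order, "CustomerTest")) {
            System.out.println(String.join(",", pair));
        }
    }
}
```

If no single predecessor reproduces the failure, the culprit is probably a combination of earlier tests, and larger prefixes of the run order would have to be tried instead.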
Can you simply amend the tests to use a clean database copy each time? DBUnit is an excellent tool for doing this.
http://www.dbunit.org/
Preface
I'm deliberately talking about system tests. We do have a rather exhaustive suite of unit tests, some of which use mocking, and those aren't going anywhere. The system tests are supposed to complement the unit tests, and as such mocking is not an option.
The Problem
I have a rather complex system that only communicates via REST and websocket events.
My team has a rather large collection of (historically grown) system tests based on JUnit.
I'm currently migrating this codebase to JUnit 5.
The tests usually consist of a @BeforeAll in which the system is started in a configuration specific to the test class, which takes around a minute. Then there are a number of independent tests on this system.
The problem we routinely run into is that booting the system takes a considerable amount of time and may even fail. One could say that the booting itself can be considered a test-case. JUnit handles lifecycle methods kind of weirdly - the time they take isn't shown in the report; if they fail it messes with the count of tests; it's not descriptive; etc.
I'm currently looking for a workaround, but what my team has done over the last few years is kind of orthogonal to the core idea of JUnit (cause it's a unit testing framework).
Those problems would go away if I replaced the @BeforeAll with a test method (let's call it @Test public void boot(){...}) and introduced an order dependency (which is pretty easy using JUnit 5) that enforces boot to run before any other test is run.
So far so good! This looks and works great. The actual problem starts when the tests aren't executed by the CI server but by developers who try to troubleshoot. When I try to start a single test, boot is filtered from the test execution and the test fails.
Is there any solution to this in JUnit5? Or is there a completely different approach I should take?
I suspect there may be a solution in using @TestTemplate, but I'm really not sure how to proceed. Also, as far as I know, that would only allow me to generate new named tests that would be filtered as well. Do I have to write a custom test engine? That doesn't seem compelling.
This is more a general testing problem than one related to JUnit 5. To skip a very long boot-up you can mock some components, if that is possible. Having the boot of the system as a test does not make sense, because other tests depend on it. It is better to use @BeforeAll in this case, as it was before. To test the boot-up itself, you can create a separate test class that runs completely independently of the other tests.
Another option is to group this kind of test, separate it from the plain unit tests, and run it only when needed (for example, before deployment on the CI server). This really depends on the specific use case and on whether those tests should be part of the regular build on your local machine.
The third option is to try to reduce the boot time, if that is possible. This is the option to take if you can't use mocks/stubs or exclude those tests from the regular build.
I have some features to test using Gherkin and Cucumber. The problem is that the execution order is random. Since, for example, the first scenario creates elements on the page, the second one looks for them and the third moves them, all tests crash because the execution goes like: nº 9 first, then 8, then 2, then...
I am not using execution tags, or if I use them, I put them above "Feature:" to make sure all scenarios run.
Could anyone shed some light on this?
General consensus within the test automation community is that your automated tests should be able to run independently. That is, tests should be runnable in any given order and the result of a test should not depend on the outcome of one or more previous tests. Try changing the architecture of your test cases.
It is possible to run tests in specific order using JUnit or TestNG.
https://www.ontestautomation.com/running-your-tests-in-a-specific-order/
I have a maven project with test execution by the maven-surefire-plugin. An odd phenomenon I've observed and been dealing with is that running locally
mvn clean install
which executes my tests, results in a successful build with 0 Failures and 0 Errors.
Now when I deploy this application to our remote repo that Jenkins attempts to build, I get all sorts of random EasyMock errors, typically of the sort:
java.lang.IllegalStateException: 3 matchers expected, 4 recorded. at org.easymock.internal.ExpectedInvocation.createMissingMatchers
This is a legacy application being inherited, and we are aware that many of these tests are flawed if not plainly using EasyMock incorrectly, but I'm in a state where with test execution I get a successful build locally but not in Jenkins.
I know that the order of execution of these tests is not guaranteed, but I am wondering how I can introspect what is different in the Jenkins build pipeline vs. local to help identify the issue?
Is there anything I can do to force the tests to execute the way they do locally? At this point, I have simply excluded many troublesome test classes, but no matter how many times I see a Jenkins failure and either fix the problem or exclude the test class, it only goes on to complain about some other test class it didn't mention before.
Any ideas how to approach a situation like this?
I have experienced quite a similar situation, and the cause in my case was clearly some concurrency problems in the test implementations.
And, after reading your comment:
What I actually did that fixed it (like magic am I right?) is for the maven-surefire plugin, I set the property reuseForks=false, and forkCount=1C, which is just 1*(number of CPU's of machine).
... I am even more convinced that you have concurrency problems in your tests. Concurrency is not easy to diagnose, especially when your experiment runs OK on one CPU. But race conditions might arise when you run it on another system (which is usually faster or slower).
I strongly recommend that you review your tests one by one and ensure that each of them is logically isolated:
They should not rely upon an expected previous state (files, database, etc.). Instead, they should prepare the proper setup before each execution.
If they concurrently modify a shared resource that might interfere with another test's execution (files, database, singletons, etc.), every assertion must be made with as much synchronization as needed, taking into account that the resource's initial state is unknown:
Wrong test:
MySingleton.getInstance().put(myObject);
assertEquals(1, MySingleton.getInstance().size());
Right test:
synchronized(MySingleton.getInstance())
{
MySingleton.getInstance().put(myObject);
assertTrue(MySingleton.getInstance().contains(myObject));
}
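To make the "right test" concrete, here is a minimal self-contained sketch. SharedRegistry is a hypothetical stand-in for MySingleton, and IsolatedCheck.putAndVerify is a made-up helper. The point it illustrates is the one above: hold the lock around both the modification and the check, and assert only on what this caller added rather than on the global size.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch only: SharedRegistry stands in for the MySingleton of the answer. */
final class SharedRegistry {
    private static final SharedRegistry INSTANCE = new SharedRegistry();
    private final Set<Object> items = new HashSet<>();

    static SharedRegistry getInstance() { return INSTANCE; }

    void put(Object item) { items.add(item); }
    boolean contains(Object item) { return items.contains(item); }
    int size() { return items.size(); }
}

final class IsolatedCheck {
    /** Adds the item and checks for it while holding the singleton's lock. */
    static boolean putAndVerify(Object item) {
        SharedRegistry registry = SharedRegistry.getInstance();
        synchronized (registry) {
            registry.put(item);
            // Checks only what this caller controls, not the global size.
            return registry.contains(item);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        boolean[] results = new boolean[workers.length];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> results[id] = putAndVerify("item-" + id));
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        for (boolean ok : results) {
            System.out.println(ok);
        }
    }
}
```

An assertion on size() == 1 would depend on how many other callers got there first, which is exactly the kind of hidden dependency on shared state described above.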
A good starting point for the review is to take one of the failing tests and trace its execution backwards to find the root cause of the failure.
Setting the tests' order explicitly is not good practice, and I wouldn't recommend it even if I knew it was possible, because it would only hide the actual cause of the problem. Bear in mind that in a real production environment the order of execution is usually not guaranteed.
JUnit test run order is non-deterministic.
Are the versions of Java and Maven the same on the 2 machines? If yes, make sure you're using the most recent maven-surefire-plugin version. Also, make sure to use a Freestyle Jenkins job with a Maven build step instead of the Maven project type. Using the proper Jenkins build type can either fix build problems outright or give you a better error so you can diagnose the actual issue.
You can turn on Maven debug logging to see the order tests are being run in. Each test should set up (and perhaps tear down) its own test data to make sure the tests can run independently. Perhaps seeing the test order will give you some clues as to which classes depend on others inappropriately. And if the app uses caching, ensure the cache is cleaned out between tests (or explicitly populated, depending on what the test needs to do). Also consider running the tests one package at a time to isolate the culprits; multiple surefire plugin executions might be useful.
Also check the app for classpath problems. This answer has some suggestions for cleaning the classpath.
And another possibility: Switching to a later version of JUnit might help - unless the app is using Spring 2.5.6.x. If the app is using Spring 2.5.6.x and cannot upgrade, the highest possible version of JUnit 4.x that may be used is 4.4. Later versions of JUnit are not compatible with Spring Test 2.5.6 and may lead to hard-to-diagnose test errors.
One of the problems of a team lead is that people on the team (sometimes even including myself) often create JUnit tests without any testing functionality.
It's easily done since the developers use their JUnit test as a harness to launch the part of the application they are coding, and then either deliberately or forgetfully just check it in without any assert tests or mock verifies.
Then later it gets forgotten that the tests are incomplete, yet they pass and produce great code coverage. Running up the application and feeding data through it will create high code coverage stats from Cobertura or Jacoco and yet nothing is tested except its ability to run without blowing up - and I've even seen that worked-around with big try-catch blocks in the test.
Is there a reporting tool out there which will test the tests, so that I don't need to review the test code so often?
I was temporarily excited to find Jester which tests the tests by changing the code under test (e.g. an if clause) and re-running it to see if it breaks the test.
However this isn't something you could set up to run on a CI server - it requires set-up on the command line, can't run without showing its GUI, only prints results onto the GUI and also takes ages to run.
PIT is the standard Java mutation tester. From their site:
Mutation testing is conceptually quite simple.
Faults (or mutations) are automatically seeded into your code, then your tests are run. If your tests fail then the mutation is killed, if your tests pass then the mutation lived.
...
Traditional test coverage (i.e. line, statement, branch etc.) measures only which code is executed by your tests. It does not check that your tests are actually able to detect faults in the executed code. It is therefore only able to identify code that is definitely not tested.
The most extreme example of the problem are tests with no assertions. Fortunately these are uncommon in most code bases. Much more common is code that is only partially tested by its suite. A suite that only partially tests code can still execute all its branches (examples).
As it is actually able to detect whether each statement is meaningfully tested, mutation testing is the gold standard against which all other types of coverage are measured.
The quality of your tests can be gauged from the percentage of mutations killed.
It has a corresponding Maven plugin to make it simple to integrate as part of a CI build. I believe the next version will also include proper integration with Maven site reports too.
Additionally, the creator/maintainer is pretty active here on StackOverflow, and is good about responding to tagged questions.
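The idea in the quoted description can be shown with a toy example. This is not how PIT is implemented or invoked. It is a hand-rolled sketch with hypothetical names, where one mutant is created by hand and two "tests" are modelled as predicates, to show why a test without assertions cannot kill a mutant.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

/** Toy illustration of the idea; this is not how PIT is implemented or invoked. */
final class MutationDemo {
    // Code under test and a hand-made mutant (">" flipped to "<").
    static final IntBinaryOperator ORIGINAL = (a, b) -> a > b ? a : b;
    static final IntBinaryOperator MUTANT = (a, b) -> a < b ? a : b;

    // A "test" returns true when it passes for the given implementation.
    static final Predicate<IntBinaryOperator> NO_ASSERT_TEST = max -> {
        max.applyAsInt(3, 7); // runs the code, checks nothing
        return true;
    };
    static final Predicate<IntBinaryOperator> ASSERTING_TEST =
            max -> max.applyAsInt(3, 7) == 7;

    /** A mutant is killed when a test that passes on the original fails on the mutant. */
    static boolean kills(Predicate<IntBinaryOperator> test) {
        return test.test(ORIGINAL) && !test.test(MUTANT);
    }

    public static void main(String[] args) {
        System.out.println("no-assert test kills mutant: " + kills(NO_ASSERT_TEST));
        System.out.println("asserting test kills mutant: " + kills(ASSERTING_TEST));
    }
}
```

Both tests execute the same line, so line coverage is identical, but only the asserting test notices the seeded fault.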
As far as possible, write each test before implementing the feature or fixing the bug the test is supposed to deal with. The sequence for a feature or bug fix becomes:
Write a test.
Run it. At this point it will fail if it is a good test. If it does not fail, change, replace, or add to it.
When you have a failing test, implement the feature it is supposed to test. Now it should pass.
You have various options:
You could probably use a code analysis tool like Checkstyle to verify that each test has an assertion. Alternatively, use a JUnit Rule to verify this, but both are easily tricked and work only on a superficial level.
Mutation testing, as Jester does it, is again a technical solution which would work, and it seems @Tom_G has a tool that might work. But these tools are (in my experience) extremely slow, because they work by changing the code, running the tests and analyzing the result, over and over again. So even tiny code bases take lots of time, and I wouldn't even think about using it in a real project.
Code Reviews: such bad tests are easily caught by code reviews, and they should be part of every development process anyway.
All this still only scratches the surface. The big question you should ponder is: why do developers feel tempted to create code just to start a certain part of the application? Why don't they write tests for what they want to implement, so there is almost no need for starting parts of the application? Get some training in automated unit testing and especially TDD/BDD, i.e. a process where you write the tests first.
In my experience it is very likely that you will hear things like: We can't test this because .... You need to find the real reason why the developers can't or don't want to write these tests, which might or might not be the reasons they state. Then fix those reasons, and those abominations of tests will go away all on their own.
What you are looking for is indeed mutation testing.
Regarding tool support, you might also want to look at the Major
mutation framework (mutation-testing.org), which is quite efficient and configurable. Major
uses a compiler-integrated mutator and gives you great control over
what should be mutated and tested. As far as I know Major does not yet
produce graphical reports but rather data (csv) files that you can
process or visualize in any way you want.
Sounds like you need to consider a coverage tool like Jacoco; the Gradle plugin provides a coverage report. I also use the EclEmma Eclipse plugin to obtain the same results, but with fairly nice integration in the IDE.
In my experience, Jacoco has provided acceptable numbers even when there are no-op unit tests, as it seems able to accurately determine the tested code paths. No-op tests get low or 0% coverage scores, and the score increases as the tests become more complete.
Update
To address the down-voter: perhaps a more appropriate tool for this is PMD. It can be used in an IDE or a build system. With proper configuration and rule development it could be used to find these incomplete unit tests. I have used it in the past to find methods missing certain security-related annotations.
I've recently been asked to, effectively, sell my department on unit testing. I can't tell you how excited this makes me, but I do have one concern. We're using JUnit with Spring and Maven, and this means that each time mvn test is called, it rebuilds the database. Obviously, we can't integrate that with our production server -- it would kill valuable data.
How do I prevent the rebuilding without telling maven to skip testing?
The best I could figure was to assign the script to operate in a test database (line breaks added for readability):
mvn test
-Ddbunit.schema=<database>test
-Djdbc.url=jdbc:mysql://localhost/<database>test?
createDatabaseIfNotExist=true&
useUnicode=true&characterEncoding=utf-8
I can't help but think there must be a better way.
I'm especially interested in learning if there is an easy way to tell Maven to only run tests on particular classes without building anything else? mvn -Dtest=<test-name> test still rebuilds the database.
======= update =======
Bit of egg on my face here. I didn't realize that I was using the same variable in two places, meaning that the POM was using a "skip.test" variable for both rebuilding the database and for running the tests...
Update: I guess that DBUnit does the rebuilding of the DB because it is told to do so in the test setup method. If you change your setup method, you can eliminate the DB rebuild. Of course, you should do it so that you get the DB reset when you need it, and omit it when you don't. My first bet would be to use a system property to control this. You can set the property on the command line the same way you already do with jdbc.url et al. Then in the setup method you add an if to test for that property and do the DB reset if it is set.
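A minimal sketch of such a switch, assuming a made-up property name test.db.reset. The class and method names are hypothetical, and the Runnable stands in for whatever the real setup method does to rebuild the database.

```java
/** Sketch of a property-controlled reset; "test.db.reset" is a hypothetical property name. */
final class DbResetSwitch {
    static final String RESET_PROPERTY = "test.db.reset";

    /** True only when the system property is set to "true" (case-insensitive). */
    static boolean resetRequested() {
        return Boolean.getBoolean(RESET_PROPERTY);
    }

    /** Would be called from the test setup method; the Runnable stands in for the real reset. */
    static boolean setUp(Runnable resetDatabase) {
        if (resetRequested()) {
            resetDatabase.run();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        boolean didReset = setUp(() -> System.out.println("resetting test database"));
        System.out.println("reset performed: " + didReset);
    }
}
```

With this shape the reset is opt-in: leaving the property unset skips the rebuild, which is the safer default if a run ever points at the wrong database.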
A test database, completely separated from your production DB, is definitely the best choice if you can have it. You can even use e.g. Derby, which can run embedded within the JVM and keep the database in memory. But in case you absolutely can't have a separate DB, use at least a separate test schema inside that DB.
In this scenario I would recommend you put your DB connection parameters into profiles within your pom, the default being the test DB, and a separate profile to contain the production settings. This way it can never happen that you accidentally run your tests against the production DB.
In general, however, it is also important to understand that tests run against a DB are not really unit tests in the strict sense, rather integration tests. If you have an existing set of such tests, fine, use them as much as you can. However, you should try to move towards adding more real unit tests, which test only a small, isolated portion of your code at once (a method or class at most), ideally self contained (need no DB, net, config files etc.) so they can run fast - this is a very important point. If you have 5000 unit tests and each takes only 5 seconds to run, that totals up to almost 7 hours, so you obviously won't run them very often. If a test takes only 5 milliseconds, you get the results in less than half a minute, so you can afford to run all your tests before you commit your latest change - many times a day. That makes a huge difference in the speed of feedback you get from the tests.
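The arithmetic behind those two figures, as a small sketch (SuiteRuntime and total are hypothetical names):

```java
import java.time.Duration;

/** Worked version of the runtime estimate above. */
final class SuiteRuntime {
    /** Total wall-clock time if the tests run one after another. */
    static Duration total(int testCount, Duration perTest) {
        return perTest.multipliedBy(testCount);
    }

    public static void main(String[] args) {
        Duration slow = total(5000, Duration.ofSeconds(5));
        Duration fast = total(5000, Duration.ofMillis(5));
        // 5000 x 5 s = 25,000 s
        System.out.println(slow.toHours() + " h " + (slow.toMinutes() % 60) + " min"); // prints "6 h 56 min"
        // 5000 x 5 ms = 25,000 ms
        System.out.println(fast.getSeconds() + " s"); // prints "25 s"
    }
}
```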
Hope this helps.
We're using JUnit with Spring and Maven, and this means that each time mvn test is called, it rebuilds the database.
Maven doesn't do anything with databases by itself, your code does. In any case, it's very unusual to run tests (which are not unit tests) against a production database.
How do I prevent the rebuilding without telling maven to skip testing?
Hard to say without more details (you're not showing anything) but profiles might be a way to go.
Unit tests, by definition, only operate on a single component in the system. You should not be attempting to write unit tests which integrate with any external services (web, DB, etc.). The solution I have to this is to use a good mocking framework to stub out the behaviour of any dependencies your components have. This encourages good interface APIs since most mocking frameworks work best with simple interfaces. It would be best to create a Repository pattern interface for any interactions with your DB and then mock out the impl any time you are testing a class that interacts with it. You can then functionally test your Repository impl separately. This also has the added benefit of keeping your unit tests fast enough to remain part of your CI so that your feedback cycle is as fast as possible.
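As a rough sketch of that structure: all names below are hypothetical, and a hand-written in-memory stub is used in place of a mocking framework so the example stays self-contained. A mock created by a framework would play the same role as InMemoryCustomerRepository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** All names here are hypothetical; a hand-written stub replaces the mocking framework. */
interface CustomerRepository {
    Optional<String> findNameById(long id);
}

/** Component under test: depends only on the interface, never on the database. */
final class GreetingService {
    private final CustomerRepository repository;

    GreetingService(CustomerRepository repository) {
        this.repository = repository;
    }

    String greet(long customerId) {
        return repository.findNameById(customerId)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, guest!");
    }
}

/** In-memory stand-in used by the unit test instead of a real DB-backed implementation. */
final class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, String> names = new HashMap<>();

    InMemoryCustomerRepository with(long id, String name) {
        names.put(id, name);
        return this;
    }

    @Override
    public Optional<String> findNameById(long id) {
        return Optional.ofNullable(names.get(id));
    }
}

final class GreetingServiceDemo {
    public static void main(String[] args) {
        GreetingService service =
                new GreetingService(new InMemoryCustomerRepository().with(1L, "Ada"));
        System.out.println(service.greet(1L));
        System.out.println(service.greet(2L));
    }
}
```

The DB-backed implementation of CustomerRepository would then get its own, slower tests that run against a test database, separate from the unit tests of the classes that use it.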