So I've read the official JUnit docs, which contain a plethora of examples, but (as with many things) I have Eclipse fired up and I am writing my first JUnit test, and I'm choking on some basic design/conceptual issues.
So if my WidgetUnitTest is testing a target called Widget, I assume I'll need to create a fair number of Widgets to use throughout the test methods. Should I be constructing these Widgets in the WidgetUnitTest constructor, or in the setUp() method? Should there be a 1:1 ratio of Widgets to test methods, or do best practices dictate reusing Widgets as much as possible?
Finally, how much granularity should exist between asserts/fails and test methods? A purist might argue that 1-and-only-1 assertions should exist inside a test method, however under that paradigm, if Widget has a getter called getBuzz(), I'll end up with 20 different test methods for getBuzz() with names like
#Test
public void testGetBuzzWhenFooIsNullAndFizzIsNonNegative() { ... }
As opposed to 1 method that tests a multitude of scenarios and hosts a multitude of assertions:
#Test
public void testGetBuzz() { ... }
Thanks for any insight from some JUnit maestros!
Pattern
Interesting question. First of all - my ultimate test pattern configured in IDE:
#Test
public void shouldDoSomethingWhenSomeEventOccurs() throws Exception
{
//given
//when
//then
}
I am always starting with this code (smart people call it BDD).
In given I place test setup unique for each test.
when is ideally a single line - the thing you are testing.
then should contain assertions.
I am not a single assertion advocate, however you should test only single aspect of a behavior. For instance if the the method should return something and also has some side effects, create two tests with same given and when sections.
Also the test pattern includes throws Exception. This is to handle annoying checked exceptions in Java. If you test some code that throws them, you won't be bothered by the compiler. Of course if the test throws an exception it fails.
Setup
Test setup is very important. On one hand it is reasonable to extract common code and place it in setup()/#Before method. However note that when reading a test (and readability is the biggest value in unit testing!) it is easy to miss setup code hanging somewhere at the beginning of the test case. So relevant test setup (for instance you can create widget in different ways) should go to test method, but infrastructure (setting up common mocks, starting embedded test database, etc.) should be extracted. Once again to improve readability.
Also are you aware that JUnit creates new instance of test case class per each test? So even if you create your CUT (class under test) in the constructor, the constructor is called before each test. Kind of annoying.
Granularity
First name your test and think what use-case or functionality you want to test, never think in terms of:
this is a Foo class having bar() and buzz() methods so I create FooTest with testBar() and testBuzz(). Oh dear, I need to test two execution paths throughout bar() - so let us create testBar1() and testBar2().
shouldTurnOffEngineWhenOutOfFuel() is good, testEngine17() is bad.
More on naming
What does the testGetBuzzWhenFooIsNullAndFizzIsNonNegative name tell about the test? I know it tests something, but why? And don't you think the details are too intimate? How about:
#Test shouldReturnDisabledBuzzWhenFooNotProvidedAndFizzNotNegative`
It both describes the input in a meaningful manner and your intent (assuming disabled buzz is some sort of buzz status/type). Also note we no longer hardcode getBuzz() method name and null contract for Foo (instead we say: when Foo is not provided). What if you replace null with null object pattern in the future?
Also don't be afraid of 20 different test methods for getBuzz(). Instead think of 20 different use cases you are testing. However if your test case class grows too big (since it is typically much larger than tested class), extract into several test cases. Once again: FooHappyPathTest, FooBogusInput and FooCornerCases are good, Foo1Test and Foo2Test are bad.
Readability
Strive for short and descriptive names. Few lines in given and few in then. That's it. Create builders and internal DSLs, extract methods, write custom matchers and assertions. The test should be even more readable than production code. Don't over-mock.
I find it useful to first write a series of empty well-named test case methods. Then I go back to the first one. If I still understand what was I suppose to test under what conditions, I implement the test building a class API in the meantime. Then I implement that API. Smart people call it TDD (see below).
Recommended reading:
Growing Object-Oriented Software, Guided by Tests
Unit Testing in Java: How Tests Drive the Code
Clean Code: A Handbook of Agile Software Craftsmanship
You would create a new instance of the class under test in your setup method. You want each test to be able to execute independently without having to worry about any unwanted state in the object under test from another previous test.
I would recommend having separate test for each scenario/behavior/logic flow that you need to test, not one massive test for everything in getBuzz(). You want each test to have a focused purpose of what you want to verify in getBuzz().
Rather than testing methods try to focus on testing behaviors. Ask the question "what should a widget do?" Then write a test that affirms the answer. Eg. "A widget should fidget"
public void setUp() throws Exception {
myWidget = new Widget();
}
public void testAWidgetShouldFidget() throws Exception {
myWidget.fidget();
}
compile, see "no method fidget defined " errors, fix the errors, recompile the test and repeat. Next ask the question what should the result of each behavior be, in our case what happens as the result of fidget? Maybe there is some observable output like a new 2D coordinate position. In this case our widget would be assumed to be in a given position and when it fidgets it's position is altered some way.
public void setUp() throws Exception {
//Given a widget
myWidget = new Widget();
//And it's original position
Point initialWidgetPosition = widget.position();
}
public void testAWidgetShouldFidget() throws Exception {
myWidget.fidget();
}
public void testAWidgetPositionShouldChangeWhenItFidgets() throws Exception {
myWidget.fidget();
assertNotEquals(initialWidgetPosition, widget.position());
}
Some would argue against both tests exercising the same fidget behavior but it makes sense to single out the behavior of fidget independent of how it impacts widget.position(). If one behavior breaks the single test will pinpoint the cause of failure. Also it is important to state that the behavior can be exercised on its own as a fulfillment of the spec (you do have program specs don't you?) that suggests you need a fidgety widget. In the end it's all about realizing your program specs as code that exercises your interfaces which demonstrate both that you've completed the spec and secondly how one interacts with your product. This is in essence how TDD should work. Any attempt to resolve bugs or test the product usually results in a frustrating pointless debate over which framework to use, level of coverage and how fine grained your suite should be. Each test case should be an exercise of breaking down your spec into a component where you can begin phrasing with Given/When/Then. Given {some application state or precondition} When {a behavior is invoked} Then {assert some observable output}.
I completely second Tomasz Nurkiewicz answer, so I'll say that rather than repeating everything he said.
A couple more points:
Don't forget to test error cases. You can consider something like that:
#Test
public void throwExceptionWhenConditionOneExist() {
// setup
// ...
try {
classUnderTest.doSomething(conditionOne);
Assert.fail("should have thrown exception");
} catch (IllegalArgumentException expected) {
Assert.assertEquals("this is the expected error message", expected.getMessage());
}
}
Also, it has GREAT value to start writing your tests before even thinking about the design of your class under test. If you're a beginner on unit-testing, I cannot emphasize enough learning this technique at the same time (this is called TDD, test-driven development) which proceeds like that:
You think about what user case you have for your user requirements
You write a basic first test for it
You make it compile (by creating needed classes -including your class under test-, etc.)
You run it: it should fail
Now you implement the functionality of the class under test that will make it pass (and nothing more)
Rinse, and repeat with a new requirement
When all your requirements have passing tests, then you're done. You NEVER write anything in your production code that doesn't have a test before (exceptions to that is logging code, and not much more).
TDD is invaluable in producing good quality code, not over-engineering requirements, and making sure you have a 100% functional coverage (rather than line coverage, which is usually meaningless). It requires a change in the way you consider coding, that's why it's valuable to learn the technique at the same time as testing. Once you get it, it will become natural.
Next step is looking into Mocking strategies :)
Have fun testing.
First of all, the setUp and the tearDown Methods will be called before and after each Test, so the setUp Method should create the objects, if you need them in every test, and test-specific things may be done in the test itself.
Second, it is up to you how you want to test your program. Obviously you could write a test for every possible situation in your program and end up with a gazillion tests for every method. Or you could write just one test for every method, which checks every possible scenario. I would recommend a mixture between both ways. You really don't need test for trivial getters/setters, but writing just one test for a method may result in confusion if the test fails. You should decide, which Methods are worth testing, and which scenarios are worth testing. But in principle every scenario should have its own Test.
Mostly I end up with a code coverage of 80 to 90 percent with my tests.
Related
I know how to make sure that a function has been invoked, using:
mockito.verify
now, I want to make sure that on every path of the function (every 'if', 'if else' and 'else') - the function was invoked.
I can basically write unit test for every case, but I want to make sure that if any further cases will be added - there will also be invocation to that method.
Unit testing alone will not do that. You have to look into using coverage in order to get there.
Unit testing can only tell you if the paths that were taken resulted in a "valid" result; but there is no knowledge of "all paths" that exist; and if they were all hit.
So you want to turn here for example and learn which coverage tool would work for you.
When you are working with eclipse or intellij, those things work out of the box; you can install plugins like cobertura or eclemma within eclipse; and then do a "run unit test with coverage".
But of course: that only results in a number. You then have to look carefully at your code to understand if you are happy with that number (where those IDEs make that really easy; they can show you your source code, and which paths were taken).
Meaning: coverage is a whole concept, and you have to understand what that means; and in which way you can make that concept helpful for your daily work. For example, you the last thing you want is your boss giving you a specific target goal for coverage.
And just to be sure: there is no tooling that tells you: you added new code, and now this specific method invocation is no longer coming through all parts. What coverage gives you is that you had 75.32% coverage before your change; and afterwards, it went down to 74.01% ... the rest is then up to you.
now, I want to make sure that on every path of the function (every 'if', 'if else' and 'else') - the function was invoked.
You don't want that.
The misunderstanding is that you don't test "the code". You test public observable behavior. In your case the behavior is that your unit under test (UuT) (after doing other stuff) calls a method on a dependency (I hope).
You don't want to test "the code" because it may change for becoming cleaner and/or support more behavior. But then you don't want to change your existing tests since they will guarantee that the desired behavior is preserved during your refactoring.
On the other hand each test method should verify exactly one expectation about the UuTs behavior. This means that you should already have one test method for each execution path though your if/else cascade. So all you have to do is to add the verify() instruction to each of this test methods.
Finally you may have an easier job testing your code if you force Single Layer of Abstraction principle which basically says that a method either calls methods on some dependencies (aka "dispatching"), calls internal methods or does low level operations. This principle may lead to a design, where the "low level" stuff your UuT currently does moves to a new dependency so that your UuT only needs to do two calls on some dependencies in a certain order...
I have encountered this issue before where I needed to tightly test bound the switch cases, and I was desperate enough to do that.
I am asuming test coverage analysis is not enough for you. For me, the if-else conditions part was so critical that changing something unintentionally could have proved greatly disastrous, so I could not afford to leave a failure prone code and I needed a test case to satisfy myself.
How I satisfied myself is here:
1: Changed the conditions such as if..else, etc to a swich case variable - TaskSwitcherEnum, say taskSwitcher, and performed all sorts of operations under various possible values of TaskSwitcherEnum.
switch (taskSwitcher){
case TaskSwitcherEnum.Task_Type_1:
//do Something before break
break;
case TaskSwitcherEnum.Task_Type_2:
//do Something again
break;
...
}
2: Tightly tested the desired method for all possible values of taskSwitcherEnum. Mockito.verify(), whether required task method is called once, for each given TaskSwitcherEnum value.
3: Finally did a junit like this:
assertEquals("Task performance strategy is designed to handle only five cases.", 5, taskSwitcherEnum.values().length);
Doing this made sure [at least test covered] following things:
1: That my code has only desired branches, and any other code branch addition/deletion is caught by a test case.
2: That each branch does it's desired job by calling a method that I want, through testing on every particular Enum values against a called method.
Gist of the whole answer is, sometimes a little design change helps a lot.
I am starting to write JUnit test cases for a legacy codebase. One of the public methods has multiple if statements and based on a condition it is calling different private methods.Should I write just one test method and test for all the conditions? or one method for each condition?
Wouldn't I lose consistency if I write individual methods for each if condition?
What is the approach to test private methods? Private method logic could be more complicated than public methods.
Base the number of methods on the number of scenarios you want to test, it has nothing to do with the methods that the thing being tested has.
If each scenario takes its own code to set up, then you will get one test method for each scenario. If you can parameterize the tests then you may be able to have one test method and pass in different data for each scenario.
The important thing is that for each combination of inputs you want the test to succeed or fail independently of the other tests. If you shoehorn all the tests into one method then that can't happen, the first test failure will prevent the remaining tests from running.
I agree with Nathan. Tests should go by scenarios not methods. Sometimes legacy code is written in a way that you need to test private methods directly though. And yes, the code should be refactored. But if you can't refactor or want a test in place first...
Option 1 - make methods package private access
This is a very safe refactoring.
Option 2 - use reflection to call the static method directly
If you REALLY can't touch the code, this is the best you can do. I'd question the requirement to not touch the code. If we can't improve the code, should we leave it in the corner to rot?
From my point of view having smaller units in unit testing normally results in better tests.
For you this would mean to change the private methods to package private and to write tests for each of them.
Assume the following setup:
interface Entity {}
interface Context {
Result add(Entity entity);
}
interface Result {
Context newContext();
SpecificResult specificResult();
}
class Runner {
SpecificResult actOn(Entity entity, Context context) {
return context.add(entity).specificResult();
}
}
I want to see that the actOn method simply adds the entity to the context and returns the specificResult. The way I'm testing this right now is the following (using Mockito)
#Test
public void testActOn() {
Entity entity = mock(Entity.class);
Context context = mock(Context.class);
Result result = mock(Result.class);
SpecificResult specificResult = mock(SpecificResult.class);
when(context.add(entity)).thenReturn(result);
when(result.specificResult()).thenReturn(specificResult);
Assert.assertTrue(new Runner().actOn(entity,context) == specificResult);
}
However this seems horribly white box, with mocks returning mocks. What am I doing wrong, and does anybody have a good "best practices" text they can point me to?
Since people requested more context, the original problem is an abstraction of a DFS, in which the Context collects the graph elements and calculates results, which are collated and returned. The actOn is actually the action at the leaves.
It depends of what and how much you want your code to be tested. As you mentionned the tdd tag, I suppose you wrote your test contracts before any actual production code.
So in your contract what do you want to test on the actOn method:
That it returns a SpecificResult given both a Context and an Entity
That add(), specificResult() interactions happen on respectively the Context and the Entity
That the SpecificResult is the same instance returned by the Result
etc.
Depending on what you want to be tested you will write the corresponding tests. You might want to consider relaxing your testing approach if this section of code is not critical. And the opposite if this section can trigger the end of the world as we know it.
Generally speaking whitebox tests are brittle, usually verbose and not expressive, and difficult to refactor. But they are well suited for critical sections that are not supposed to change a lot and by neophytes.
In your case having a mock that returns a mock does look like a whitebox test. But then again if you want to ensure this behavior in the production code this is ok.
Mockito can help you with deep stubs.
Context context = mock(Context.class, RETURNS_DEEP_STUBS);
given(context.add(any(Entity.class)).specificResult()).willReturn(someSpecificResult);
But don't get used to it as it is usually considered bad practice and a test smell.
Other remarks :
Your test method name is not precise enough testActOn does tell the reader what behavior your are testing. Usually tdd practitioners replace the name of the method by a contract sentence like returns_a_SpecificResult_given_both_a_Context_and_an_Entity which is clearly more readable and give the practitioner the scope of what is being tested.
You are creating mock instances in the test with Mockito.mock() syntax, if you have several tests like that I would recommend you to use a MockitoJUnitRunner with the #Mock annotations, this will unclutter a bit your code, and allow the reader to better see what's going on in this particular test.
Use the BDD (Behavior Driven Dev) or the AAA (Arrange Act Assert) approach.
For example:
#Test public void invoke_add_then_specificResult_on_call_actOn() {
// given
... prepare the stubs, the object values here
// when
... call your production code
// then
... assertions and verifications there
}
All in all, as Eric Evans told me Context is king, you shall take decisions with this context in mind. But you really should stick to best practice as much as possible.
There's many reading on test here and there, Martin Fowler has very good articles on this matter, James Carr compiled a list of test anti-patterns, there's also many reading on using well the mocks (for example the don't mock types you don't own mojo), Nat Pryce is the co-author of Growing Object Oriented Software Guided by Tests which is in my opinion a must read, plus you have google ;)
Consider using fakes instead of mocks. It's not really clear what the classes in question are meant to to, but if you can build a simple in-memory (not thread-safe, not persistent etc) implementation of both interfaces, you can use that for flexible testing without the brittleness that sometimes comes from mocking.
I like to use names beginning mock for all my mock objects. Also, I would replace
when(result.specificResult()).thenReturn(specificResult);
Assert.assertTrue(new Runner().actOn(entity,context) == specificResult);
with
Runner toTest = new Runner();
toTest.actOn( mockEntity, mockContext );
verify( mockResult ).specificResult();
because all you're trying to assert is that specificResult() gets run on the right mock object. Whereas your original assert doesn't make it quite so clear what is being asserted. So you don't actually need a mock for SpecificResult. That cuts you down to just one when call, which seems to me to be about right for this kind of test.
But yes, this does seem frightfully white box. Is Runner a public class, or some hidden implementation detail of a higher level process? If it's the latter, then you probably want to write tests around the behaviour at the higher level; rather than probing implementation details.
Not knowing much about the context of the code, I would suggest that Context and Result are likely simple data objects with very little behavior. You could use a Fake as suggested in another answer or, if you have access to the implementations of those interfaces and construction is simple, I'd just use the real objects in lieu of Fakes or Mocks.
Although the context would provide more information, I don't see any problems with your testing methodology myself. The whole point of mock objects is to verify calling behavior without having to instantiate the implementations. Creating stub objects or using actual implementing classes just seems unnecessary to me.
However this seems horribly white box, with mocks returning mocks.
This may be more about the class design than the testing. If that is the way the Runner class works with the external interfaces then I don't see any problem with having the test simulate that behavior.
First off, since nobody's mentioned it, Mockito supports chaining so you can just do:
when(context.add(entity).specificResult()).thenReturn(specificResult);
(and see Brice's comment for how to do enable this; sorry I missed it out!)
Secondly, it comes with a warning saying "Don't do this except for legacy code." You're right about the mock-returning-mock being a bit strange. It's OK to do white-box mocking generally because you're really saying, "My class ought to collaborate with a helper like <this>", but in this case it's collaborating across two different classes, coupling them together.
It's not clear why the Runner needs to get the SpecificResult, as opposed to whatever other result comes out of context.add(entity), so I'm going to make a guess: the Result contains a result with some messages or other information and you just want to know whether it's a success or failure.
That's like me saying, "Don't tell me all about my shopping order, just tell me that I made it successfully!" The Runner shouldn't know that you only want that specific result; it should just return everything that came out, the same way that Amazon shows you your total, postage and all the things you bought, even if you've shopped there lots and are perfectly aware of what you're getting.
If some classes regularly use your Runner just to get a specific result while others require more feedback then I'd make two methods to do it, maybe called something like add and addWithFeedback, the same way that Amazon let you do one-click shopping by a different route.
However, be pragmatic. If it's readable the way you've done it and everyone understands it, use Mockito to chain them and call it a day. You can change it later if you have need.
Recently a new concept of Theories was added to JUnit (since v4.4).
In a nutshell, you can mark your test method with #Theory annotation (instead of #Test), make your test method parametrized and declare an array of parameters, marked with #DataPoints annotation somewhere in the same class.
JUnit will sequentially run your parametrized test method passing parameters retrieved from #DataPoints one after another. But only until the first such invocation fails (due to any reason).
The concept seems to be very similar to #DataProviders from TestNG, but when we use data providers, all the scenarios are run inspite of their execution results. And it's useful because you can see how many scenarious work/don't work and you can fix your program more effectively.
So, I wonder what's the reason not to execute #Theory-marked method for every #DataPoint? (It appears not so difficult to inherit from Theories runner and make a custom runner which will ignore failures but why don't we have such behaviour out of the box?)
UPD: I have created a fault-tolerant version of Theories runner and made it available for a public access: https://github.com/rgorodischer/fault-tolerant-theories
In order to compare it with the standard Theories runner run StandardTheoriesBehaviorDemo then FaultTolerantTheoriesBehaviorDemo which are placed under src/test/... folder.
Reporting multiple failures in a single test is generally a sign that
the test does too much, compared to what a unit test ought to do.
Usually this means either that the test is really a
functional/acceptance/customer test or, if it is a unit test, then it
is too big a unit test.
JUnit is designed to work best with a number of small tests. It
executes each test within a separate instance of the test class. It
reports failure on each test. Shared setup code is most natural when
sharing between tests. This is a design decision that permeates JUnit,
and when you decide to report multiple failures per test, you begin to
fight against JUnit. This is not recommended.
Long tests are a design smell and indicate the likelihood of a design
problem. Kent Beck is fond of saying in this case that "there is an
opportunity to learn something about your design." We would like to
see a pattern language develop around these problems, but it has not
yet been written down.
Source: http://junit.sourceforge.net/doc/faq/faq.htm#tests_12
To ignore assertion failures you can also use a JUnit error collector rule:
The ErrorCollector rule allows execution of a test to continue after
the first problem is found (for example, to collect all the incorrect
rows in a table, and report them all at once)
For example you can write a test like this.
public static class UsesErrorCollectorTwice {
#Rule
public ErrorCollector collector= new ErrorCollector();
#Test
public void example() {
String x = [..]
collector.checkThat(x, not(containsString("a")));
collector.checkThat(y, containsString("b"));
}
}
The error collector uses hamcrest Matchers. Depending on your preferences this is positive or not.
AFAIK, the idea is the same as with asserts, the first failure stops the test. This is the difference between Parameterized & Theories.
Parameterized takes a set of data points and runs a set of test methods with each of them. Theories does the same, but fails when the first assert fails.
Try looking at Parameterized. Maybe it provides what you want.
A Theory is wrong if a single test in it is wrong, according to the definition of a Theory. If your test cases don't follow this rule, it would be wrong to call them a "Theory".
I have a problematic situation with some quite advanced unit tests (using PowerMock for mocking and JUnit 4.5). Without going into too much detail, the first test case of a test class will always succeed, but any following test cases in the same test class fails. However, if I select to only run test case 5 out of 10, for example, it will pass. So all tests pass when being run individually. Is there any way to force JUnit to run one test case at a time? I call JUnit from an ant-script.
I am aware of the problem of dependant test cases, but I can't pinpoint why this is. There are no saved variables across the test cases, so nothing to do at #Before annotation. That's why I'm looking for an emergency solution like forcing JUnit to run tests individually.
I am aware of all the recommendations, but to finally answer your question here is a simple way to achieve what you want. Just put this code inside your test case:
Lock sequential = new ReentrantLock();
#Override
protected void setUp() throws Exception {
super.setUp();
sequential.lock();
}
#Override
protected void tearDown() throws Exception {
sequential.unlock();
super.tearDown();
}
With this, no test can start until the lock is acquired, and only one lock can be acquired at a time.
It seems that your test cases are dependent, that is: the execution of case-X affects the execution of case-Y. Such a testing system should be avoided (for instance: there's no guarantee on the order at which JUnit will run your cases).
You should refactor your cases to make them independent of each other. Many times the use of #Before and #After methods can help you untangle such dependencies.
Your problem is not that JUnit runs all the tests at once, you problem is that you don't see why a test fails. Solutions:
Add more asserts to the tests to make sure that every variable actually contains what you think
Download an IDE from the Internet and use the built-in debugger to look at the various variables
Dump the state of your objects just before the point where the test fails.
Use the "message" part of the asserts to output more information why it fails (see below)
Disable all but a handful of tests (in JUnit 3: replace all strings "void test" with "void dtest" in your source; in JUnit 4: Replace "#Test" with "//D#TEST").
Example:
assertEquals(list.toString(), 5, list.size());
Congratulations. You have found a bug. ;-)
If the tests "shouldn't" effect each other, then you may have uncovered a situation where your code can enter a broken state. Try adding asserts and logging to figure out where the code goes wrong. You may even need to run the tests in a debugger and check your code's internal values after the first test.
Excuse me if I dont answer your question directly, but isn't your problem exactly what TestCase.setUp() and TestCase.tearDown() are supposed to solve? These are methods that the JUnit framework will always call before and after each test case, and are typically used to ensure you begin each test case in the same state.
See also the JavaDoc for TestCase.
You should check your whole codebase that there are no static variables which refer to mutable state. Ideally the program should have no static mutable state (or at least they should be documented like I did here). Also you should be very careful about cleaning up what you write, if the tests write to the file system or database. Otherwise running the tests may leak some side-effects, which makes it hard to make the tests independent and repeatable.
Maven and Ant contain a "forkmode" parameter for running JUnit tests, which specifies whether each test class gets its own JVM or all tests are run in the same JVM. But they do not have an option for running each test method in its own JVM.
I am aware of the problem of dependant
test cases, but I can't pinpoint why
this is. There are no saved variables
across the test cases, so nothing to
do at #Before annotation. That's why
I'm looking for an emergency solution
like forcing JUnit to run tests
individually.
The #Before statement is harmless, because it is called for every test case. The #BeforeClass is dangerous, because it has to be static.
It sounds to me that perhaps it isn't that you are not setting up or tearing down your tests properly (although additional setup/teardown may be part of the solution), but that perhaps you have shared state in your code that you are not aware of. If an early test is setting a static / singleton / shared variable that you are unaware of, the later tests will fail if they are not expecting this. Even with Mocks this is very possible. You need to find this cause. I agree with the other answers in that your tests have exposed a problem that should not be solved by trying to run the tests differently.
Your description shows me, that your unit tests depend each other. That is strongly not recommended in unit tests.
Unit test must be independent and isolated. You have to be able to execute them alone, all of them (in which order, it does not matter).
I know, that does not help you. The problem will be in your #BeforeClass or #Before statements. There will be dependencies. So refactor them and try to isolate the problem.
Probably your mocks are created in your #BeforeClass. Consider to put it into the #Before statement. So there's no instance that last longer than a test case.