Generating test data for unit test cases for nested objects

Generating test data for unit test cases for nested objects - java

When mocking dependent services for writing unit test cases for any enterprise-grade java service, I find setting up the data for the unit test cases a huge pain. Most of the times, this is the single most compelling reason for developers to not write unit test cases and rather write integration style test cases. If the service is dependent on couple of other services (which depend on their respective DAO's) and a DAO of its own, generating the when-thenReturn clauses for a reasonably nested object becomes quite an effort and developers are seen to be taking the easy route and loading the entire spring context and sourcing their data from the direct sources which may not always give the data that can traverse all required code paths. With this in the background, a colleague of mine suggested that why not run a sample integration test, and using aspects, capture all of the relevant data points and serialize it to an XML representation which may be used for materializing test data for the unit test cases. To our pleasant surprise we found a framework called TestDataCaptureJ on github which was very similar to this. It used aspects to capture the data points and it generated the java code to create the objects.
The motivation stated on the site seemed very apt and I was wondering if there are any other alternatives that can give similar features. Also, it would be great if the experts can critique this overall approach.
Also, the project is about 2 yrs old and has a few bugs which we we had to fix and are hoping to give it back as a mavenized github fork. Just checking to ensure that there is no other similar initiative from one of the well-known stables as well.
Thanks in advance!

I have two critiques to that approach... and please bear in mind that my knowledge of your context is almost nil, which means that what I suggest here might not work for you.
I've only once experienced a problem like the one you mentioned, and it was a symptom that there was too much coupling between the objects because the responsbilities were way to broad. Since then I use a Domain-Driven Design approach and I haven't had this problem again.
I prefer to use Test-Data Builders to create test data. This approach allows me to have a template of what I want to build, and just replace the bits I'm interested in the test. If you decide to go this way, I strongly suggest you to use a tiny library called Make-It-Easy that simplifies the creation of these builders.
And two suggestions
If you have some time, I suggest you to
Watch a presetation called The Deep Synergy Between Testability and Good Design by Michael Feathers - Part of the talk is about something very similar to what you're experiencing.
Read the book Growing Object-Orieted Systems, Guided by Tests (aka GOOS), it has all sorts of insights about how to write simple, amazing, testable code.

Related

Should anyone use unit test library methods and classes in live development environment?

This questions looks weird or may be pointless at all.
Recently, I was reviewing some code in java language where the developer used one of the methods from a unit testing library "org.easytesting".
Example: He was using a method "Strings.isNullOrEmpty" of "Strings" class from this library to verify the non-nullability of some values and was using other classes/methods at other places in the code.
I know a library is developed to make our life easier(basic principles of Java) and can be used anywhere/everywhere, but is there a recommendation about using a unit test library in live development code ?
I know using it won't led to a compatibility issue because unit test cases are executed always.
I searched at many places may be I'm missing a good word to search.

It could be argued that a unit-test library is just a library, but I don't see it like this.
First, the purpose of a unit-test library is to be used in code that is not production code. Which means, that certain quality criteria relevant for production code might not be met. For example, if there is a bug in a unit-test library it is annoying, but normally does not harm the production code. Or, performance may not be quite as relevant, thread safety and so on. I don't want to say that the popular unit-testing frameworks are of a bad quality. But, the developers of these libraries have all the right in the world to take design decisions based on the assumption that their code will not be part of production code.
Secondly, using a library should be done in accordance to the philosophy of the respective library. For example, if you use a certain gui library, this brings implications on the way event handling is done in your application. And, unit-testing frameworks come under the assumption that the framework is in control of the executable (from the test runner). Therefore, all functions from that library may depend on the test runner being set up and running. If some function from the library does not have this dependency, that is an implementation detail which may change with a new version of the library.
Finally, code should communicate intent. That includes includes (pun intended). It was not the intent of the developer to write unit-testing code, but that would be what the inclusion of a unit-testing library would communicate.

Considering that there are other, production-oriented libraries out there which check if a string is empty or null, any use of the testing framework's method should be treated as a strong code smell and identified in code reviews.
In the future, this testing library may introduce a change in other parts which make running it in production either prohibitively expensive or insecure, as the code running through this empty or null check could be leveraged as an area of attack. Worse, if your team wants to pivot away from this testing framework, you now have to change production code which many teams would be reluctant to do if all they're doing is changing test dependencies.

Without looking specifically at test libraries, here's an answer to this more general question:
Should you use the general-programming utility classes that are provided by any framework or library? (For example, should you use the StringUtils/CollectionUtils/etc provided by a web/UI/logging/ORM/... framework).
The arguments by the other answers are still mostly valid even in this more general case. Here are some more:
These utilities have been developed specifically for use by the framework. They likely only contain very specific methods with narrow use cases (those that are actually required by the framework) and nothing more. They might be optimized for specific internal behavior of the framework and not for general purposes.
Framework developers may introduce breaking changes without much thought, since they don't expect many users outside of their framework.
It would be alarming to see imports from e.g. a UI library in your back end code, it looks like code smell.
In modular projects, you wouldn't want to introduce additional dependencies to the framework's utilities (again, a dependency to an UI framework from you back end modules is code smell). It would also add a bunch of unnecessary transitive dependencies that may aren't even compatible with other dependencies.
So I would say generally you shouldn't use the utilities of frameworks, except in those places where you are actually working with those frameworks. But even then you should consider using Apache Commons or Guava etc. for consistency.
Now you could also replace the terms UI and back end with test and production in the last two points. Test frameworks are also special in the sense that you usually don't include them as run-time dependency. So you would need to change the scope of the test dependencies to make them available at run-time. You should avoid this for the reasons given in the last point.

Why tighly coupled code is hard to unit test?

I have heard that tighly coupled code is hard to unit test. I dont understand how? Can somebody explain with example.

Tight coupling means that you use implementations instead of interfaces, reducing the array of options when it comes to creating mock implementations and other testing utilities. It may helped by using mocking frameworks (like Mockito for Android) but should nonetheless be avoided, as it is a bad practice.
However, this is probably the least problematic aspect of highly coupled code. It is generally discouraged, because it limits refactoring and/or expanding possibilities. You should always keep some level of abstraction in your code, to be able to easily implement new modules and change current implementations. But do not overdo it, because programs which have lots on interface-implementation exclusive pairs are very redundant and hard to debug.
In general, you should have a look at some open-source projects and see how those are tested (for Android, check out the Google I/O app for example) and how the testing approach is reflected in the code. It all comes with experience and there is no better way to learn it than by analyzing how pros do it :-)

How to write a unit test framework?

How to write a unit test framework?
Can anyone suggest some good reading?
I wish to work on basic building blocks that we use as programmers, so I am thinking of working on developing a unit test framework for Java.
I don't intend to write a framework that will replace junit;
my intention is to gain some experience by doing a worthy project.

There are several books that describe how to build a unit test framework. One of those is Test-Driven Development: By Example (TDD) by Kent Beck. Another book you might look at is xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.
Why do you want to build your own unit test framework?
Which ones have you tried and what did you find that was missing?
If (as your comments suggest) your objective is to learn about the factors that go into making a good unit test framework by doing it yourself, then chapters 18-24 (Part II: The xUnit Example) of the TDD book show how it can be done in Python. Adapting that to Java would probably teach you quite a lot about Python, unit testing frameworks and possibly Java too.
It will still be valuable to you to have some experience with some unit test framework so that you can compare what you produce with what others have produced. Who knows, you might have some fundamental insight that they've missed and you may improve things for everyone. (It isn't very likely, I'm sorry to say, but it is possible.)
Note that the TDD people are quite adamant that TDD does not work well with databases. That is a nuisance to me as my work is centred on DBMS development; it means I have to adapt the techniques usually espoused in the literature to accommodate the realities of 'testing whether the DBMS works does mean testing against a DBMS'. I believe that the primary reason for their concern is that setting up a database to a known state takes time, and therefore makes testing slower. I can understand that concern - it is a practical problem.

Basically, it consists of three parts:
preparing set of tests
running tests
making reports
Preparing set of tests means that your framework should collect all tests which you want to run. You can specify these tests (usually classes with test methods which satisfy some convention or marked with certain annotation or implement marker interface) in a separate file (java or xml), or you can find them dynamically (making a search over classpath).
If you choose the dynamic searching, then you'll probably have to use some libraries which can analyse java bytecode. Otherwise you'll have to load all the classes in your classpath, and this a) requires much time and b) will execute all static initializers of loaded classes and can cause unexpected tests results.
Running tests can vary significantly depending on features of your framework. The simplest way is just calling test methods inside a try/catch block, analysing and saving results (you have to analyze 2 situations - when the assertion exception was thrown and when it was not thrown).
Making reports is all about printing analyzed results in xml/html/wiki or whatever else format.

The Cook's Tour is written by Kent Beck (I believe; it's not attributed), and describes the thought process that went into writing JUnit. I would suggest reading it and considering how you might choose an alternate line of development.

Java: Convenient way to refactor the application

We have an agile enterprise application built on JSP and Servlet without any design strategy.
This application was built in early 2002 considering 1000 users. After 2002, we received lots of requests from the marketing partners.
Currently, the application has lots of spaghetti code with lots of Ifs and elses. One class has more than 20,000 lines of code with a huge body of functions without abstraction.
Now, we need to support billions of records,
what we need to do immediately and gradually?
We have to refactor the application?
Which framework, we need to use?
How the usage of the framework will be helpful to the end users?
How to convince the leaders to do the refactoring?
How to gain the faster response time as compare to the current system?

Here is how I would approach this if I had appropriate company resources at my disposal (yeah right):
Get a good QA process going, with automated regression testing set up before making significant changes. I don't care how good you are, you can't put a system like that under unit test and reasonably control for regressions.
Map out interdependencies, see how much an individual class can be tested as a unit.
How do you eat an elephant? One bite at a time. Take a given piece of required functionality (preferably something around the increase load requirements) and refactor the parts of the class or classes that can be worked on in isolation.
Learn how do 3 above by reading Working Effectively with Legacy Code.

Convenient way to refactor the application.
There are no "convenient" or "easy" ways to refactor an existing codebase, especially if the codebase looks like spaghetti.
... what we need to do immediately and gradually?
That's impossible to answer without understanding your system's current architecture.
We have to refactor the application?
On the one hand, the fact that you have a lot of poorly designed / maintained code would suggest that it needs some refactoring work.
However, it is not clear that it will be sufficient. It could be that a complete rewrite would be a better idea ... especially if you need to scale up by many orders of magnitude.
Which framework, we need to use?
Impossible to answer without details for your application.
How the usage of the framework will be helpful to the end users?
It might reduce response times. It might improve reliability. It might allow more online users simultaneously. It might do none of the above.
Using a framework won't magically fix a problem of bad design.
How to convince the leaders to do the refactoring?
You need to convince them that the project is going to give a good return on investment (ROV). You / they also need to consider the alternatives:
what happens if you / they do nothing, or
is a complete rewrite likely to give a better outcome.
How to gain the faster response time as compare to the current system?
Impossible to answer without understanding why the current system is slow.
The bottom line is that you probably need someone from outside your immediate group (e.g. an external consultant) to do a detailed review your current system and report on your options for fixing it. It sounds like your management don't trust your recommendations.

These are big, big questions. Too broad for one answer really.
My best advice is this: start small, if you can. Refactor piece by piece. And most importantly, before touching the code, write automated tests against the current codebase, so you can be relatively sure you haven't broken anything when you do refactor it.
It may not be possible to write these tests, as the code may not be testable in it's current format. But you should still make this one of your main goals.

By definition refactoring shouldn't show any difference to the users. It's only to help developers work on the code. It sounds like you want to do a more extensive rewrite to modernize the application. Moving to something like JSF will make life a lot easier for developers and will give you access to web component libraries to improve the user experience.

It is a question which needs a lengthy answer. To start with I would suggest that the application is tested well and is working as per the specification. This means there are enough unit, integration and functional tests. The functional tests also have to be automated. Once these are in place, a step by step refactoring can take place. Do you have enough tests to start?

Adoption of TDD on older Java application

I've got a problem and I'm asking you for help
I've started working on web application, that has no tests, is based on spring 2.5 and hibernate 3.2, is not very well modularized, with classes having up to 5k lines, as view technology there is JSP used all over the place with quite a lot things duplicated (like many similar search forms with very few differencies but with not many shared parts).
Aplication works well, however, everything is running just fine, but when there is need to add or to change some functionality, it is realy slow and not very convenient.
Is there any possibility to employ TDD at this point? Or what would you recomend as I dont't think I can develop it forever the way it is now, it is just getting messier all the time.
Thanky you for answers.

I would start by picking up a copy of Michael Feathers' book Working Effectively with Legacy Code - this is pure gold.
Once you learn techniques for refactoring and breaking apart your application at it's logical seams, you can work on integrating TDD in newer modules/sprout classes and methods, etc.
Case in point, we recently switched to a TDD approach for a ten year old application written in almost every version of our framework, and while we're still struggling with some pieces, we've made sure that all of our new work is abstracted out, and all of the new code is under test.
So absolutely doable - just a bit more challenging, and the book above can be a tremendous help in getting started.

First, welcome to the club of poor good programmers that have to fix crimes done by their worse colleagues. :(
I had such experience. In this case one of the recommended practices is developing tests for new features. You cannot stop now and develop tests for whole application. What you can do is every time you have to write new feature develop tests for this feature also. If this feature requires changes in some sensitive places start tests for these places.
Refactoring is a big problem. Ideally if you want to separate 5k lines class to 10 normal size classes you should first develop test case(s) for the big class, then perform refatoring and then run tests again to validate that you have not break anything. It is very hard in practice because when you change the design you change the interface and therefore you cannot run exactly the same tests. So, each time you should make the hard decision what is the best way and what are the minimal test case that covers your ass.
For example sometimes I performed 5 phase refatoring:
1. developed tests for bad big class
2. developed new well designed code and changed the old class to be the facade for my new code.
3. ran the test case developed in #1 to validate that everything works
4. developed new tests that verify that each new (small) sub module works well
5. refactred code, i.e. removed all references to the big old class (that became lightweight facade)
5. removed the old class and its tests.
But this is the worse case scenario. I had to use it when code that I am changing is extremely sensitive.
Shortly, good luck in your hard job. Prepare to work overnight and then receive 20 bug reports from QA and angry email from your boss. :( Be strong. You are on the right way!

If you feel like you can't make any changes for fear of breaking stuff, then you have answered your own question: you need to do something.
The first rule of holes is: If you are stuck in a hole, stop digging.
You need to institute a policy such that if code is committed without a test, that is the exception and not the rule. Use continuous integration and force people to keep the build passing.
I recommend starting by capturing the core functionality of the app in tests, both unit and integration. These tests should be a baseline that shows the necessary functionality is working.
You mentioned there is a lot of code duplication. Thats the next place to go. Put a test around an area with duplicate code. You will be testing 2 or more items here, since there is duplication. Then do a refactor and see if the tests still pass.
Once you knock one domino down, the rest will follow.

Yes there is definitely a place for TDD, but it is only a part of the solution.
You need to refactor this application before you can make any changes. Refactoring requires test coverage to be in place. Take small portions of obviously substandard code and write characterisation tests for them. This means you test all the variations possible through that code. You will probably find bugs doing this. Raise the bugs via your QA system and keep the buggy behaviour for now (lock the bugs in with your characterisation tests as other parts of the system might, for now, be relying on the buggy behaviour).
If you have very long and complex methods, you may call upon your IDE to extract small portions to separate methods where appropriate. Then write characterisation tests for those methods. Attack big methods in this way, bit by bit, until they are well-partitioned. Finally, once you have tests in place, you can refactor.
integration tests can be useful in this circumstance to highlight happy-day scenarios or a few major error scenarios. But usually in this circumstance the application is far too complex to write a complete integration test suite. This means you might never be protected 100% against side-effects using integration tests alone. That is why I prefer 'extract method' and characterise.
Now that your application is protected from side-effects, you may add new features using TDD.

My approach would be to start adding tests piece by piece. If there is a section you know you're going to have to update in the near future, start getting some good coverage on that section. Then when you need to update/refactor, you have your regression tests. From the sounds of it, it will be a major undertaking to establish a comprehensive test suite, but it will most likely pay off in the end. I would also suggest using one of the various code coverage tools available to see how much your tests are actually covering.
Cheers!

You probably can't do test driven development at this point, except if you happen to add functionality that is easy to isolate from the rest of the system (which is unlikely).
However, you can (and should) certainly add automated tests of your core functionality. Those are at first not going to be real unit tests in the sense of testing small units of code in isolation, but IMO the importance of those is often overstated. Integration tests may not run as fast or help you pinpoint the cause of bugs as quickly, but they still help tremendously in protecting you against side effects of changes. And that's something you really need when you refactor the code to make future changes easier and real unit tests possible.
In general, go for the low hanging but juicy fruit first: write tests for parts of the code that can be tested easily, or break easily, or cause the most problems when they break (and where tests are thus most valuable), or ideally all of these together. This gives you real value quickly and helps convince reluctant developers (or managers) that this is a path worth pursuing.
A continuous build server is a must. A test suite that people have to remember to run manually to get its benefit means that you're wasting most of its benefit.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.