Proposed Cassandra 2.1 setup for unit tests? - java

I'm currently wondering what options I have for configuring Cassandra for unit testing.
At the moment I use an SSD drive, point the Cassandra data directory at it, and start the tests by loading test scenarios. It is dead slow, but I reuse the server and heal the scenarios (restore instead of delete and start over). Besides that, what else can I do?
I have also considered creating a RAM disk and mounting it just for those tests.
What options are useful in conjunction with tests without introducing functional differences that would make acceptance tests worthless?
Is there an in-memory replacement, the way one replaces MySQL/PostgreSQL with H2 for unit testing?

I'd avoid using a real Cassandra instance for your unit testing: it will make your tests brittle, and they won't be able to run just anywhere. There are a couple of options for unit testing your DAOs without a real Cassandra needing to be available.
One option is Cassandra Unit. This works by starting up an embedded Cassandra for you to connect to; you can create keyspaces/tables and insert data to prime it just like a real Cassandra.
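As a rough sketch of how that can look with cassandra-unit's JUnit rule (the CQL file, keyspace and table names below are made up for illustration, and the exact rule API may differ between cassandra-unit versions):

// Starts an embedded Cassandra and loads "dataset.cql" into a test keyspace.
// Assumes the usual JUnit static imports.
public class UserDaoTest {

    @Rule
    public CassandraCQLUnit cassandra =
            new CassandraCQLUnit(new ClassPathCQLDataSet("dataset.cql", "test_keyspace"));

    @Test
    public void readsPrimedRow() {
        ResultSet rows = cassandra.session.execute("SELECT * FROM users WHERE id = 'u1'");
        assertNotNull(rows.one());
    }
}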
Another option is Scassandra. This starts up a stubbed Cassandra and needs to be primed with what to return. The great thing about Scassandra is that you can test all of your error scenarios, such as timeouts, NoHostAvailableException, etc.

Related

Best practices or an effective approach to writing integration tests which run in a continuous integration environment

In general, I write integration tests from my service/remoting layer down to the database so that I can check that the server-side layers are integrated and tested. I would like to keep rollback set to false; otherwise we miss out on database constraint-level validation. It is a personal preference.
We can follow different approaches
- Create data for each test case and delete it once executed
- Run with a certain amount of existing common data (such as User)
There may be entities that depend on several other entities, and to test such flows it takes a lot of effort to create every entity for each test case or class. For a business flow we might decide to create a certain amount of data, execute the flow with a certain number of tests, and then clear the data. Running such test cases can consume a lot of time.
Is there an effective approach or best practice followed in the industry for writing integration tests in continuous integration environments? I normally use TestNG as it provides Spring support. Are there any Java-based frameworks?
I think it really depends on a project and there is no silver bullet solution here.
There are indeed many approaches as you state, I'll mention a few:
Take advantage of Spring's @Transactional annotation on the test. In this case, Spring will execute a rollback after each test, so that the data changed by the test won't really be saved in the database even if the test passes (see the sketch after this list).
Do not use @Transactional but organize the tests so that they won't interfere (each test uses its own set of data that can co-exist with the other tests' data). If a test fails and doesn't "clean up" its stuff, the other tests should still run. In addition, if the tests are run in parallel, they still should not interfere.
Use a new schema for each test (obviously expensive, but it can still be a viable option for some projects).
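A minimal sketch of the first approach, assuming a Spring Boot project and a hypothetical UserRepository/User pair; Spring rolls the transaction back when each test method finishes:

// Assumes the usual JUnit static imports.
@RunWith(SpringRunner.class)
@SpringBootTest
@Transactional
public class UserRepositoryIntegrationTest {

    @Autowired
    private UserRepository userRepository;  // hypothetical Spring Data repository

    @Test
    public void savesAndReadsBackAUser() {
        userRepository.save(new User("alice"));
        assertEquals(1, userRepository.count());
        // the insert is rolled back automatically after the test method returns
    }
}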
Now, the real question is what do you test.
If you are testing Java code, such as checking that your SQL statements are created correctly, then the first approach is probably the way to go.
Of course, it also depends on what commands are executed during the tests; not all commands can run inside a transaction in every database (for example, in Postgres you can use DDL inside a transaction, in Oracle you can't, and so forth).
Another concern to think about during the continuous testing is the performance of tests.
Integration tests are slow and if you have a monolith application that runs hundreds of them, then the build will be really slow. Managing build that runs hours is a big pain.
I would like to mention two ideas that can help here:
Moving to microservices helps a lot in this case (each microservice runs only its own tests, so each microservice's build is naturally much faster).
Another interesting option to consider is running the tests against a Docker container of the database that is started right in the test case (it can also be cached so that not every test starts a new container). A big benefit of this approach is that everything runs locally (on the build server), so there is no interaction with a remote database (performance), and the clean-up of resources happens automatically even if some tests fail: the Docker container dies and all the data put in by the tests is cleaned up with it. Take a look at the Testcontainers project; maybe you'll find it helpful.
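A rough sketch of that idea with Testcontainers and JUnit 5 (the image tag, class name, and assertion are placeholders):

// Assumes the testcontainers, junit-jupiter and postgresql driver dependencies,
// plus the usual JUnit static imports.
@Testcontainers
class OrderDaoIntegrationTest {

    // one container per test class; Testcontainers starts and stops it automatically
    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:15-alpine");

    @Test
    void canConnectToTheContainerizedDatabase() throws Exception {
        try (Connection connection = DriverManager.getConnection(
                postgres.getJdbcUrl(), postgres.getUsername(), postgres.getPassword())) {
            assertTrue(connection.isValid(2));
        }
    }
}

The container is thrown away when the test class finishes, so no manual clean-up is needed.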

Is it bad practice to allow my JUnit tests to interact with a real DB?

I'm building a basic HTTP API and some actions like POST /users create a new user record in the database.
I understand that I could mock these calls, but at some level I'm wondering if it's easier to let my JUnit tests run against a real (test) database. Is this a bad practice? Should only integration tests run against a real DB?
I'm using Flyway to maintain my test schema and Maven for my build, so I can have it recreate the test DB with the proper schema on each build. But I'm also worried that I'd need some additional overhead to maintain/clean the state of the database between each test, and I'm not sure if there's a good way to do that.
Unit tests are used to test a single unit of code. This means that you write a unit test by writing something that tests one method only. If there are external dependencies, then you mock them instead of actually calling and using those dependencies.
So, if your test interacts with a real database, then it is not a unit test. Say your call to the DB fails for some reason; the unit test will then fail as well. The success or failure of your unit test should not depend on external dependencies like the DB in your case. You have to assume that the DB call is successful, hard-code the data it returns using a mocking framework (such as Mockito), and then test your method using that data.
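For example, a small sketch with Mockito; UserDao, User and UserService are hypothetical names standing in for your own classes:

// The DAO is mocked, so no real database is touched.
// Assumes the usual JUnit static imports.
UserDao userDao = Mockito.mock(UserDao.class);
Mockito.when(userDao.findById(42L)).thenReturn(new User(42L, "alice"));

UserService service = new UserService(userDao);

assertEquals("alice", service.getUserName(42L));
Mockito.verify(userDao).findById(42L);  // the collaboration itself can be asserted too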
As often, it depends.
On big projects with lots of JUnit tests, the performance overhead can be an issue. The time needed to set up the test data in the database, as well as the concept needed to keep tests from interfering with each other's test data during parallel execution, are strong arguments for only testing against a database when needed and otherwise mocking it away.
On small projects these problems may be easier to handle, so you could always use a database, but I personally wouldn't do that even on small projects.
As several other answers suggest you should create unit tests for testing small pieces of code with mocking all external dependencies.
However, sometimes (a lot of the time) it is worth testing whole features, especially when you use a framework like Spring or rely on a lot of annotations. When your classes or methods have annotations on them, the effects of those annotations usually cannot be tested via unit tests. You need the whole framework running during the test to make sure it works as expected.
In our current project we have almost as many integration tests as unit tests. We use the H2 in-memory DB for these tests; this way we can avoid failures caused by connectivity problems. Spring's test framework can collect multiple integration tests into the same test context, so it has to build the context only once for multiple tests, and running these tests is therefore not too expensive.
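As a sketch of how such a test can look with Spring Boot's @DataJpaTest slice, which swaps the real DataSource for an embedded in-memory database such as H2 (the repository and entity here are hypothetical):

// Assumes the usual JUnit static imports.
@RunWith(SpringRunner.class)
@DataJpaTest  // boots a JPA slice of the context backed by an embedded database
public class UserRepositoryH2Test {

    @Autowired
    private UserRepository userRepository;  // hypothetical Spring Data repository

    @Test
    public void findsUserByName() {
        userRepository.save(new User("alice"));
        assertTrue(userRepository.findByName("alice").isPresent());
    }
}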
You can also create separate test contexts for different parts of the project (with different settings and DB content), so that tests running under different contexts won't interfere with each other.
Don't be afraid of using a lot of integration tests. You need some anyway, and if you already have a test context, it's not a big deal to add some more tests to the same context.
Also, there are a lot of cases which would take a LOT of effort to cover with unit tests (or cannot be fully covered at all) but can be covered simply by an integration test.
A personal experience:
Our numerous integration tests were extremely useful when we switched from Spring Boot 1 to Spring Boot 2.
Back to the original question:
Unit tests should not connect to a real DB, but feel free to use more integration tests (with an in-memory DB).
Modern development practices recommend that every developer run the full suite of unit tests often. Unit tests should be reliable (they should not fail if the code is OK). Using an external database can interfere with those desiderata.
If the database is shared, simultaneous runs of the testsuite by different developers could interfere with each other.
Setting up and tearing down the database for each test is typically expensive, and thus can make the tests too slow for frequent execution.
However, using a real database for integration tests is OK. If you use an in-memory database instead of a fully real database, even set up and tear down of the database for each integration test can be acceptably fast.
A popular choice is the use of an in-memory database to run tests. This makes it easy to test, for example, repository methods and business logic involving database calls.
When opting for a "real" database, make sure that every developer has his/her own test database to avoid conflicts. The advantage of using a real database is that this prevents possible issues that could arise because of slight differences in behavior between in-memory and real database. However, test execution performance can be an issue when running a large test suite against a real database.
Some databases can be embedded in a way that the database doesn't even need to be installed for test execution. For example, there is an SO thread about firing up an embedded Postgres in Spring Boot tests.

TDD without local database?

When we develop a Rails application then we use a local database in our development environment, and make sure that our specs pass as part of TDD.
Is it the norm not to use a local database similar to SQLite while doing TDD in Java? I have been told an in-memory database (HSQL) is all that is needed for running unit and integration tests. Is this a standard practice?
We use SQLite in our Rails application for local development and for running our RSpec specs. But my question is about Java development. We are working on rewriting part of our application in Java. I have been told that you do not need any database for development if you write integration tests covering all functionality, and that HSQL is sufficient for that. As I am used to having a database for local development in Rails, I am wondering how you debug any issues later on. It is quite helpful for analysing issues if we can replicate the data and scenario in the local environment. How do you do the same in Java/Spring if you do not use any database for development and rely completely on HSQL for testing?
Personally, I never use any database, including HSQLDB, to write a unit test.
I prefer to create interfaces such as *Repository and let the SUT communicate with them. Then I write implementation classes that implement the interface I have created. The class hierarchy looks like this:
             <<uses>>
SUT -------------------> Repository
                              ^
                              | <<implement>>
                              |
        |------------|--------|--------|
        |            |        |        |
       JPA       Hibernate   JDBC     etc.
This approach is known as Separation of Concerns: the application domain is one concern, data access is another. Following it results in many plug-compatible components and independent modules, such as domain, jpa, jdbc, etc., but the important thing is that it makes your code more testable.
Then I use Test Doubles to mock/stub out the collaborators in the unit test, to verify that they work together as expected. The pseudo-code looks like this:
// Stub the collaborator, exercise the SUT, then assert outcome and state
// (assumes the usual Mockito and JUnit static imports).
Repository repo = mock(Repository.class);
when(repo.find(id)).thenReturn(entity);

SUT it = new SUT(repo);

assertEquals(expectedResult, it.exercise());
assertEquals(expectedState, it.currentState);
But you must write some integration tests using a database to test each Repository implementation that operates on the third-party API; this is what Martin Fowler calls Test Isolation.
The answer to your question: it is very common to have your test environment database as close to the development environment as possible.
I suppose that you are concerned about performance; there are more crucial things you could improve before considering an in-memory database.
Usually while TDD-ing you would only run the tests involved and later run your whole suite to check that you didn't break anything. If you are using RSpec you could use tags.
Another important thing is to clean the database at the beginning of every test, since tests should be isolated and never depend on the results of previous tests. This will also help with the complex search queries you may have in your system. There is a gem that could help you here.
Finally, if you are using some sort of continuous integration tool remember to set it up using rake db:schema:load instead of rake db:migrate. This will run your schema file as a single migration instead of running each single migration every time you commit. (Remember to keep this version-controlled and always up to date)
You are getting the terminology wrong. TDD is about writing test cases in general. But most of the time, and also in your question, one thinks about using TDD for unit testing.
And unfortunately, the terms are not very clear. When you turn to Wikipedia, you find there (in my words) that "anything you do to test a piece of software" can be called a unit test.
But that isn't helpful. You should rather look for definitions such as here. And the main aspect there: unit tests work in isolation. Quoting from that link:
Runs in memory (no DB or File access, for example)
Thus:
when doing unit testing, you should not use any database
when you write integration tests, you want to ensure that your solution works "end to end". In that sense you might be using a special instance of your database, but not a different kind of database.

Elegant way for DAO testing

I want to improve my DB access code tests.
I am using the GAE datastore. To test the DB classes, I used a backdoor servlet. Just wondering, is there a more efficient and elegant way to do DAO testing?
What are your views on unit vs. integration tests for DAOs?
It depends a bit on how your database is set up. Here are a couple of other options apart from what you already have:
you can write unit tests directly against your DAOs. You can mock the database calls away with Mockito.
you can write unit tests that record the interaction with the database and then replay it when you run the tests a second time. See the Betamax library for this.
you can run your tests against the actual database. Now they are not unit tests anymore but a kind of integration test. In this case you will need to think about how to get a clean state in the database to start from (see the sketch after this list).
you can run integration tests against the entire system and make sure that most of your database code is touched by using a code coverage tool.
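For the third option above, a minimal sketch of restoring a clean state before each test (the table names and the injected dataSource are placeholders):

@Before
public void cleanDatabase() throws SQLException {
    try (Connection connection = dataSource.getConnection();
         Statement statement = connection.createStatement()) {
        // wipe only the tables this test class touches, children before parents
        statement.executeUpdate("DELETE FROM orders");
        statement.executeUpdate("DELETE FROM users");
    }
}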
I prefer to have full-blown integration tests on the whole thing, including the database and any other third-party integrations, and unit tests on the particulars, not necessarily involving the actual database calls. But - as always - your setup may lead you in other directions.

Testing SQL query on Oracle which includes a remote database

Our development databases (Oracle 9i) use a remote database link to a remote shared database.
This decision was made years ago when it wasn't practical to put some of the database schemas on a development machine - they were too big.
We have certain schemas on the development machines and we make the remote schemas look local by using Oracle's database links, together with some synonyms on the development machines.
The problem I have is that I would like to test a piece of SQL which joins tables in schemas on either side of the database link.
e.g. (a simplified case):
select a.col, b.col
from a, b
where a.b_id = b.id
a is on the local database
b is on the remote database
I have a synonym on the local DB so that 'b' actually points at b@remotedb.
Running the query takes ages in the development environment because of the link. The queries run fine in production (I don't think the Oracle cost-based optimiser copes very well with database links).
We have not been very good at writing unit tests for these types of queries in the past - probably due to the poor performance - so I'd like to start creating some tests for them.
Does anyone have any strategies for writing a unit test for such a query, so as to avoid the performance problems of using the database link?
I'd normally be looking at ways of mocking out the remote service, but since all this is in a SQL query, I can't see any way of easily mocking out the remote database.
You should create exact copies of all the schemas you need from production on development, but without all the data. Populate the schemas with enough data so you can do a proper test. You can also make the optimizer behave on the test system like it does on production by exporting the statistics from the production server and importing them into the development database for the schemas you are duplicating. That way the query will run against the data set you've made, but will be optimized with plans similar to those in production. Then you can estimate theoretically how it will scale in production.
Copy the relevant data into your development database and create the tables locally.
Ideally, just build a test case which tells you:
The SQL is correct (it parses)
It operates correctly with a few rows of test data
Don't fall for the "let's copy everything" because that means you'll have no idea what you're testing anymore (and what you're missing).
If in doubt, create a table b with just a single record. If you get an error in this area, add more rows as you learn where it can fail.
If you want to take this to the edge, create the test table (with all data) in a unit test. This way, you can document the test data you're using.
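A rough sketch of what that could look like with plain JDBC against a throwaway local Oracle test schema (the connection constants and column types are placeholders; the query is the simplified one from the question):

// Assumes an empty, disposable test schema and the usual JUnit static imports.
@Test
public void joinFindsMatchingRows() throws SQLException {
    try (Connection con = DriverManager.getConnection(TEST_JDBC_URL, TEST_USER, TEST_PASSWORD);
         Statement st = con.createStatement()) {
        // build the "remote" table locally with a single, well-known row
        st.executeUpdate("CREATE TABLE b (id NUMBER PRIMARY KEY, col VARCHAR2(20))");
        st.executeUpdate("INSERT INTO b (id, col) VALUES (1, 'remote-row')");
        st.executeUpdate("CREATE TABLE a (b_id NUMBER, col VARCHAR2(20))");
        st.executeUpdate("INSERT INTO a (b_id, col) VALUES (1, 'local-row')");

        try (ResultSet rs = st.executeQuery(
                "select a.col, b.col from a, b where a.b_id = b.id")) {
            assertTrue(rs.next());
        }
    }
}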
[EDIT] What you need is a test database. Don't run tests against a database which can change. Ideally, the tests should tear down the whole database and recreate it from scratch (tables, indexes, data, everything) as the first step.
In this test database, only keep well defined test data that only changes by defining new tests (and not by someone "just doing something"). If you can, try to run your tests against an in-memory database.
I would suggest materialized views. These are views that store remote data locally.
In theory, for unit testing you can work with any set of controlled data created and designed based on your test cases. It doesn't have to be your live or development system; that's assuming your unit is portable enough. You would test it with your current databases/application when you come to integration testing, which might as well be on the live system anyway (so no DB links would be required - I understand your live databases are in one place).
What I'm trying to say is that you can/should test your unit (i.e. your component, query or whatever you define as a unit) on a controlled set of data that simulates different 'use cases', and once you complete your testing with satisfactory results, you can proceed to integration and running integration tests.
Integration tests - you could run these in the live environment, but only after you've proven by unit testing that your component is 'bullet-proof' (if that's OK with your company's approach/philosophy :) - the sysadmin's reaction: "Are you flippin' crazy?!").
If you are trying to go back in time and test already-implemented units, then why bother? If they've been in production use for some time without any incidents, I would argue that they're OK. However, there's always a chance that your unit/query has some 'slowly ticking time bomb' effect (a cumulative effect over time). Well, analysing the impact is the answer.
