For a project I am working on (Spring/Struts 2/Hibernate), we decided to use H2 for unit testing, with MySQL as the production store and the schema managed in Liquibase. This is pretty standard fare, but the issue we keep running into is that H2 and MySQL differ in a lot of ways, for example in how they handle timestamps and triggers. It's getting to the point where I am starting to regret using H2, as the extra headaches the mismatches cause are starting to outweigh its benefits. My question is this: is there any other in-memory/local-file database that behaves more like MySQL? Obviously we will still use MySQL for integration testing, but it would be nice to be able to do unit testing without either turning the Liquibase files into a giant hack or having to ensure that a local MySQL db is running.
I don't think there is another in-memory Java database that is more compatible with MySQL than H2. If you have a lot of code that only works with MySQL, then you should probably also use MySQL for testing.
Just be aware that it will be difficult in the future to switch to another database. Relying too much on features of one product will result in a "vendor lock in". In case of MySQL at least you have the option to switch to MariaDB, so it's not all that bad.
You may use a RAM drive: copy your test tables and data onto that drive, and start MySQL configured to load from it, all from a script at boot time.
Your unit tests will then run insanely faster. We used this on developers' workstations and the level of frustration went three steps down.
I think that as of right now the correct approach is to use MySQL as a Docker image.
Once you create the image you can easily spin it up from your tests, and it's going to take seconds. Hibernate will initialize the DB schema dynamically, and there you go!
The only issue is that CI servers need to have Docker installed.
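A minimal sketch of spinning the container up from test setup code, assuming the docker CLI is on the PATH and the official mysql image is used. The class name, container name, database name, password and port are placeholders made up for illustration; MYSQL_ROOT_PASSWORD and MYSQL_DATABASE are the environment variables the official image reads.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper: starts a throwaway MySQL container before the tests run.
final class MySqlDockerLauncher {

    // Builds the `docker run` command line for a detached, self-removing container.
    static List<String> runCommand(String containerName, String database,
                                   String rootPassword, int hostPort) {
        return List.of(
                "docker", "run", "--rm", "-d",
                "--name", containerName,
                "-e", "MYSQL_ROOT_PASSWORD=" + rootPassword,
                "-e", "MYSQL_DATABASE=" + database,
                "-p", hostPort + ":3306",
                "mysql:8.0");
    }

    // Runs the command and waits for the docker CLI to return; returns its exit code.
    // The CLI returning does not mean MySQL is already accepting connections.
    static int start(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    // JDBC URL the tests would then connect to.
    static String jdbcUrl(int hostPort, String database) {
        return "jdbc:mysql://localhost:" + hostPort + "/" + database;
    }
}
```

Since the server needs a few seconds to initialize after the container starts, the test setup would still have to retry the first connection until it succeeds.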
I am testing an API that creates databases/tables in Postgres. For automated testing, I was thinking of having a setup method that creates a database with the tables set up and populated with the required data (1000 entries/rows).
Is there an elegant way of doing this? Any thoughts, apart from writing code that loops 1000 times and writes data stored in a CSV file to a Postgres table?
Honestly CSV, XML, or any other structured format seems fine to me. Is there a reason you don't want to do that? Using the pg_dump command to export data from an existing DB and using pg_restore could be a good option too.
Your other idea about writing code to generate the data isn't bad either. The benefit of writing code is that your test isn't coupled to a data file.
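A rough sketch of the "generate the data in code" option. The table layout (id, name, email) and the class name are assumptions made up for illustration, since the question does not show its schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for a table with columns (id, name, email).
final class TestDataGenerator {

    // Produces `count` deterministic CSV lines of the form: id,name,email
    static List<String> csvRows(int count) {
        List<String> rows = new ArrayList<>(count);
        for (int id = 1; id <= count; id++) {
            rows.add(id + ",user" + id + ",user" + id + "@example.com");
        }
        return rows;
    }

    // Packs the rows into one multi-row INSERT so the setup needs a single statement.
    // String concatenation is acceptable here only because the values are generated
    // by the test itself and contain no quotes or commas.
    static String insertStatement(String table, List<String> rows) {
        StringBuilder sql = new StringBuilder(
                "INSERT INTO " + table + " (id, name, email) VALUES ");
        for (int i = 0; i < rows.size(); i++) {
            String[] fields = rows.get(i).split(",");
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("(").append(fields[0])
               .append(", '").append(fields[1])
               .append("', '").append(fields[2]).append("')");
        }
        return sql.toString();
    }
}
```

Calling TestDataGenerator.csvRows(1000) yields the 1000 rows, and because the data is deterministic the tests can assert on specific ids without reading a data file.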
Also, I would take a look at the H2 database because it has PostgreSQL compatibility mode, and you can actually embed it in your unit/integration tests instead of relying on a PostgreSQL server to be set up and configured in your tests. We've used H2 to test our PostgreSQL app, and it's worked well. The downside is you can't be 100% sure that just because your test passes against H2 that it will against PostgreSQL.
If you really prefer to use postgresql (instead of H2) for testing you could use liquibase. It's a database schema management tool that supports (among others) bulk loading data from csv (http://www.liquibase.org/documentation/changes/load_data.html).
It also offers Spring integration if your application uses it.
I am developing a very large-scale J2EE application, and we chose to use Derby as an embedded database for JUnit testing, since hitting the actual prod database would slow down our tests. When I bootstrap my application, the Derby DB creates all the tables so I can run JDBC queries against it. It works fine, but the drawback is that I cannot actually query any of the tables except through JDBC calls at runtime, so if I need to make changes to my queries, I need to stop the app, modify my query statements, then restart the application and run in debug. This process makes it very difficult to analyze complex queries. Does anyone know of some kind of Derby plugin that can help me query the DB without doing it through my Java code?
If you are using Maven for your build, you can use the derby-maven-plugin, which I wrote and which is available on GitHub and via Maven Central. It will take care of starting and stopping the database for you before your tests. You will need to populate this database yourself, of course. You will also have the database in your target/derby folder after the tests execute, so you can always query the data yourself afterwards. This will help you work in a separate development environment which doesn't affect the production database.
You can check here for my answer to a similar question.
This question is extracted from a comment I posted here:
What's the best strategy for unit-testing database-driven applications?
So I have a huge database schema for a legacy application (with quite an old code base) that has many tables, synonyms, triggers, and dblinks. We have (finally) started to test some parts of the application.
Our tests are already using mocks, but in order to test the queries that we are using we have decided to use an in-memory db with short-lived test dataset.
But the setup of the in-memory database requires a specific SQL script for the db schema setup. The script is not the real DDL we have in production because we can not import it directly.
To make things harder, the database contains functions and procedures that need to be implemented in Java (we use the H2 db, and that is the way to declare procedures there).
I'm afraid that our tests won't break on the day the real db changes, and that we will spot the problem only at runtime, potentially in production.
I know that our tests are quite at the border between integration and unit. However with the current architecture it is quite hard to insulate the test from the db. And we want to have proper tests for the db queries (no ORM inside).
What would be a solution for having a DDL as close as possible to the real one, without the need to maintain it manually?
If your environments are Dockerized I would highly suggest checking out Testcontainers (https://www.testcontainers.org/modules/databases/). We have used it to replace in-memory databases in our tests with database instances created from production DDL scripts.
Additionally, you can use tmpfs mounting to get performance levels similar to in-memory databases. This is nicely explained in following post from Vlad Mihalcea: https://vladmihalcea.com/how-to-run-integration-tests-at-warp-speed-with-docker-and-tmpfs/.
This combination works great for our purposes (especially when combined with Hibernate auto-ddl option) and I recommend that you check it out.
I am developing an application that tests different web services, and I want it to be as generic as possible. I need to populate a database to run JUnit tests, but I don't want these changes to be committed.
I know that some in-memory databases like HSQL DB allow testing on a sort of a virtual (or mock) database, but unfortunately I use oracle and I cannot change it now because of my complex data tables structure.
What is the best practice you suggest?
Thanks.
First of all, HSQL and Hibernate aren't related in any way. The question is whether you can find an embedded database which supports the same SQL as your production database (or rather the subset of SQL which your application uses).
A good candidate for this is H2 database since it emulates a lot of different SQL flavours.
On top of that: Don't test the database. Assume that the database is tested thoroughly by your vendor and just works.
In my code, I aim for:
Save and load each entity.
Generate the SQL for all the queries that I use and compare them against String literals in tests (i.e. I don't run the queries against the database all the time).
Some tests look for a System property. If it's set, then they will run the queries against the database. This happens during the night on my CI server.
The rationale for this: As long as the DB schema doesn't change, there is no point to actually run the queries. That means running them during the day while I sit in front of the computer is a huge waste of time.
To make sure that "low impact" changes don't slip through the gaps, I let a computer run them when I don't care.
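A minimal sketch of the two ideas above: comparing generated SQL against a string literal, and gating the real database run behind a system property. The query builder, the table and column names, and the property name run.db.tests are all made up for illustration; they are not taken from the answer or the linked blog post.

```java
import java.util.List;

// Hypothetical query builder standing in for whatever code generates the SQL.
final class UserQueries {

    static String findByStatus(List<String> columns, String table) {
        return "SELECT " + String.join(", ", columns)
                + " FROM " + table + " WHERE status = ?";
    }
}

// Decides whether the slow, database-backed tests should run.
final class DbTestSwitch {

    // "run.db.tests" is an assumed property name; the nightly CI job would pass
    // -Drun.db.tests=true, while a developer's normal run leaves it unset.
    static boolean enabled() {
        return Boolean.getBoolean("run.db.tests");
    }
}
```

A fast test would assert that UserQueries.findByStatus(List.of("id", "name"), "users") equals the literal "SELECT id, name FROM users WHERE status = ?", and only execute the statement against the database when DbTestSwitch.enabled() returns true.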
Along the same lines, I have mocks for many DAOs which return various predefined results, so I don't have to query the database. The rationale here is that I want to test the processing of results from the database, not the JDBC API, the DB driver, the OS's TCP/IP stack, the network hardware (and software), or any other of the 1000 things between my code and the database records on a harddisk somewhere.
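One way such a DAO mock might look, written by hand without a mocking library. OrderDao, StubOrderDao and OrderService are hypothetical names used only to illustrate the idea.

```java
import java.util.List;

// Hypothetical DAO interface; the production implementation would use JDBC.
interface OrderDao {
    List<Integer> findAmountsForCustomer(long customerId);
}

// Hand-written stub that returns a predefined result instead of querying a database.
final class StubOrderDao implements OrderDao {
    private final List<Integer> amounts;

    StubOrderDao(List<Integer> amounts) {
        this.amounts = amounts;
    }

    @Override
    public List<Integer> findAmountsForCustomer(long customerId) {
        return amounts;
    }
}

// The code under test: it processes DAO results and never sees a connection.
final class OrderService {
    private final OrderDao dao;

    OrderService(OrderDao dao) {
        this.dao = dao;
    }

    int totalForCustomer(long customerId) {
        int total = 0;
        for (int amount : dao.findAmountsForCustomer(customerId)) {
            total += amount;
        }
        return total;
    }
}
```

A test can then build new OrderService(new StubOrderDao(List.of(10, 20, 12))) and check the computed total without any driver, network, or database involved.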
More details in my blog: http://blog.pdark.de/2008/07/26/testing-with-databases/
Our development databases (Oracle 9i) use a remote database link to a remote shared database.
This decision was made years ago when it wasn't practical to put some of the database schemas on a development machine - they were too big.
We have certain schemas on the development machines and we make the remote schemas look local by using Oracle's database links, together with some synonyms on the development machines.
The problem I have is that I would like to test a piece of SQL which joins tables in schemas on either side of the database link.
e.g. (a simplified case):
select a.col, b.col
from a, b
where a.b_id = b.id
a is on the local database
b is on the remote database
I have a synonym on the local DB so that 'b' actually points at b@remotedb.
Running the query takes ages in the development environment because of the link. The queries run fine in production (I don't think the Oracle cost based optimiser can cope very well with database links).
We have not been very good at writing unit tests for these types of queries in the past, probably due to the poor performance, so I'd like to start creating some tests for them.
Does anyone have any strategies for writing a unit test for such a query, so as to avoid the performance problems of using the database link?
I'd normally be looking at ways of mocking out the remote service, but since all this is in a SQL query, I can't see any way of easily mocking out the remote database.
You should create exact copies of all the schemas you need from production on development, but without all the data. Populate the schemas with enough data to do a proper test. You can also make the optimizer on the test system behave like production by exporting the statistics from the production server and importing them into the development database for the schemas you are duplicating. That way the query will run against the data set you've made, but it will be optimized with plans similar to those in production. Then you can estimate theoretically how it will scale in production.
Copy the relevant data into your development database and create the tables locally.
Ideally, just build a test case which tells you:
The SQL is correct (it parses)
It operates correctly with a few rows of test data
Don't fall for the "let's copy everything" because that means you'll have no idea what you're testing anymore (and what you're missing).
If in doubt, create a table b with just a single record. If you get an error in this area, add more rows as you learn where it can fail.
If you want to take this to the edge, create the test table (with all data) in a unit test. This way, you can document the test data you're using.
[EDIT] What you need is a test database. Don't run tests against a database which can change. Ideally, the tests should tear down the whole database and recreate it from scratch (tables, indexes, data, everything) as the first step.
In this test database, only keep well defined test data that only changes by defining new tests (and not by someone "just doing something"). If you can, try to run your tests against an in-memory database.
I would suggest materialized views. These are views that store remote data locally.
In theory to do the unit-testing you can work with any set of controlled data created and designed based on your test-cases. It doesn't have to be your live or development system. That's assuming your unit is portable enough. You would test it with your current databases/application when you come to integration testing, which might as well be on the live system anyway (so no db links will be required - I understand your live databases are in one place).
What I'm trying to say, is that you can/should test your unit (i.e. your component, query or whatever you define as a unit) on a controlled set of data that would simulate different 'use cases' and once you complete your testing to satisfactory results, then you can proceed to integration + running integration tests.
Integration tests - you could run this in the live environment, but only after you've proved by unit-testing that your component is 'bullet-proof' (if that's OK with your company's approach/philosophy :) - sys admin's reaction:"Are you flippin creazy?!")
If you are trying to go back in time and test already implemented units, then why bother? If they've been in production use for some time without any incidents, then I would argue that they're OK. However, there's always a chance that your unit/query might have some 'slowly ticking time bomb' side effect (a cumulative effect over time). In that case, analysing the impact is the answer.