DB migrations (Liquibase + Maven + Spring) - java

I'm coming from a .Net background cutting my teeth on a Java project that is using Maven, Spring and Liquibase. Needless to say, this is a new bag of concepts and frameworks to deal with.
Tests won't complete:
My tests won't complete successfully because they fail when attempting to access a table in my database; the table doesn't exist. I see that I have many migration files in Liquibase XML format within my project, but I'm not sure how to run them.
liquibase-maven-plugin not an option:
I see that others use the liquibase-maven-plugin, but in my case the project does not reference that plugin in any pom.xml, only liquibase-core. A handful of developers who knew what they were doing worked on this project in the past; given that they never referenced this plugin in the pom.xml file, I assume that was for good reason, and I won't be stirring that pot.
SpringLiquibase?
They have a reference to a bean that looks like this: <bean id="liquibase" class="liquibase.integration.spring.SpringLiquibase">, which after further research appears to do automatic data migrations,
GREAT!
....but how do I go about invoking it? Must my project already pass my tests and actually be "run" before this logic gets hit? If that is the case and my project must successfully build and test first, then I apparently must run my migrations outside of this SpringLiquibase bean.
Should I be using the liquibase command line and if so, can I safely assume this is what the previous developers were doing to initially establish their database?

You are right that the SpringLiquibase setup should do the database update automatically, but it will only do it when the Spring context is started.
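For reference, a fully configured SpringLiquibase bean usually looks something like the following sketch (the dataSource bean name and changelog path are placeholders; the bean in your project will have its own values):

<bean id="liquibase" class="liquibase.integration.spring.SpringLiquibase">
    <!-- the DataSource Liquibase runs its migrations against -->
    <property name="dataSource" ref="dataSource"/>
    <!-- the master changelog that includes all migration files -->
    <property name="changeLog" value="classpath:db/changelog/db.changelog-master.xml"/>
</bean>

When the Spring context containing this bean starts, Liquibase applies any changesets that have not been run yet.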
My guess is that your normal application startup fires Liquibase through Spring, but the test framework does not. Perhaps the previous developers never noticed because they would add a database change to the Liquibase changelog files, start the normal application for initial testing (which updated the database), and only then build and run the tests. Now that you are running the tests first, the database is not there yet.
Are you able to tell if your tests are trying to start Spring?
Even in cases where an application is using SpringLiquibase, I usually recommend configuring your project to also allow manual updates via the liquibase-maven-plugin, the Ant plugin, or the command line, because it tends to make for a more efficient process. With that setup, you can add changesets and then run liquibase update without going through an entire application startup or even running your tests. You could set it to run automatically on test execution, but the update process is usually infrequent enough that it is better to avoid the liquibase update overhead on every test run. It is still very helpful to keep it in your application's Spring setup so that in QA and production you don't have to remember to manually update the database; it is just automatically kept up to date.
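If you do go that route, a minimal sketch of the plugin setup might look like this (the group/artifact IDs are the real plugin coordinates, but the version, changelog path, and connection settings below are placeholders; these are often kept in a liquibase.properties file instead):

<plugin>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-maven-plugin</artifactId>
    <version>3.8.9</version>
    <configuration>
        <changeLogFile>src/main/resources/db/changelog/db.changelog-master.xml</changeLogFile>
        <url>jdbc:mysql://localhost:3306/mydb</url>
        <username>dbuser</username>
        <password>dbpass</password>
    </configuration>
</plugin>

With that in place, mvn liquibase:update applies any pending changesets without starting the application or running the tests.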

Related

Spring Boot with AspectJ and aspectjweaver not persisting singleton when started with IntelliJ

So, I have a really weird problem which cost me hours to find.
I have a simple Spring Boot application which I start with a normal Spring Boot run configuration in IntelliJ.
I've added a special Aspect configuration (see https://callistaenterprise.se/blogg/teknik/2020/09/20/multi-tenancy-with-spring-boot-part2/) which requires two Java agents (spring-instrument, aspectjweaver).
I've added them separately to the run configuration's "VM options" as follows:
-javaagent:target/dependency/spring-instrument-5.3.8.jar -javaagent:target/dependency/aspectjweaver-1.9.6.jar
They successfully get loaded on start.
Whenever the Aspect function is called, a singleton that has already been used/instantiated before magically gets re-instantiated and is then, of course, empty.
This does not happen (everything works as expected) when I use mvn spring-boot:run (or just java -jar -javaagent:.... app.jar) to run the application. In this case, the Maven configuration in pom.xml takes care of adding the needed agents.
What I think is that IntelliJ somehow adds special stuff to the run command, or something else interferes with the application, so that the Aspect function is not called in the same context/instance.
It should not be related to threading, since the code is prepared for that.
Does anybody have an idea what's happening here? I'd like to keep the Spring Boot run configuration and thus the Debug option, since debugging does not seem to work out of the box with mvn spring-boot:run.

Edit and re-run spring boot unit test without reloading context to speed up tests

I have a Spring Boot app and have written unit tests using a Postgres test container (https://www.testcontainers.org/) and JUnit. The tests have the @SpringBootTest annotation, which loads the context and starts up a test container before running the test.
Loading the context and starting the container takes around 15 sec on my relatively old MacBook, but the tests themselves are pretty fast (< 100 ms each). So in a full build with hundreds of tests this does not really matter; it is a one-time cost of 15 sec.
But developing/debugging the tests individually in an IDE becomes very slow. Every single test incurs a 15 sec startup cost.
I know IntelliJ and Spring Boot support hot reload of classes when the app is running. Are there similar solutions/suggestions for doing the same for unit tests? I.e., keep the context loaded and the test container (DB) running, but recompile just the modified test class and run the selected test again.
There is a simple solution for your issue, I believe. You haven't specified how exactly you run the test container in the test; however, I have had success with the following approach:
For tests running locally, start a PostgreSQL server on your laptop once (say, at the beginning of your working day). It can be a dockerized process or even a regular PostgreSQL installation.
During the test, the Spring Boot application doesn't really know that it is interacting with a test container; it gets host/port/credentials and that's it, and it creates a DataSource out of these parameters.
So for your local development, you can modify the integration with the test container so that the actual test container is launched only if there is no "LOCAL.TEST.MODE" env variable defined (you can pick basically any name; it's not something that already exists).
Then define the env variable on your laptop (or use a system property for that, whatever works better for you) and configure Spring Boot's DataSource to use the properties of your local installation when that property is defined:
In a nutshell, it can be something like:
@Configuration
@ConditionalOnProperty(name = "test.local.mode", havingValue = "true", matchIfMissing = false)
public class MyDbConfig {

    @Bean
    public DataSource dataSource() {
        // create a data source initialized with local credentials
        // (URL and credentials below are placeholders for your local PostgreSQL)
        return DataSourceBuilder.create()
                .url("jdbc:postgresql://localhost:5432/mydb")
                .username("postgres")
                .password("postgres")
                .build();
    }
}
Of course, more "clever" solution with configuration properties can be implemented, it all depends on how do you integrate with test containers and where do the actual properties for the data source initialization come from, but the idea will remain the same:
In your local env. you'll actually work with a locally installed PostgreSQL server and won't even start the test container
Since all the operations in postgresql including DDL are transactional, you can put a #Transactional annotation on the test and spring will roll back all the changes done by the test so that the DB won't be full of garbage data.
As opposed to Test containers, this method has one significant advantage:
If your test fails and some data remains in the database you can check that locally because the server will remain alive. So you'll be able to connect to the db with PG Admin or something and examine the state...
Update 1
Based on the OP's comment:
I see what you're saying. Basically, you've mentioned two different issues, which I'll try to address separately.
Issue 1: the application context takes about 10-12 seconds to start.
OK, this is something that requires investigation. The chances are that there is some bean that gets initialized slowly, so you should understand why exactly the application starts so slowly:
Spring's own code (scanning, bean definition population, etc.) runs in fractions of a second and usually is not a bottleneck by itself; the slowness must be somewhere in your application.
Checking the beans' startup time is somewhat out of scope for this question, although there are certainly methods to do so, for example:
see this thread, and for newer Spring versions, if you use Actuator, this here. So I'll assume you will figure out, one way or another, why it starts slowly.
Anyway, what can you do with this kind of information, and how can you make the application context load faster?
Well, obviously you can exclude the slow bean or set of beans from the configuration; maybe you don't need it at all in the tests, or you can at least use @MockBean instead - this highly varies depending on the actual use case.
It's also possible, in some cases, to provide configuration that still loads that slow bean but alters its behavior so that it isn't slow.
I can also point out some "generally applicable ideas" that can help regardless of your actual code base.
First of all, if you're running different test cases (multi-select tests in the IDE and run them all at once) that share exactly the same configuration, then Spring Boot is smart enough not to re-initialize the application context. This is called application context caching. Here is one of the numerous tutorials about this topic.
Another approach is lazy bean initialization. In Spring Boot 2.2+ there is a property for that:
spring:
  main:
    lazy-initialization: true
Of course, if you're not planning to use it in production, define it in the src/test/resources configuration file of your choice; Spring Boot will read it during the tests as well, as long as it adheres to the naming convention. If you have technical issues with this (again, out of scope for this question), then consider reading this tutorial.
If your Spring Boot version is older than 2.2, you can try to do that "manually": here is how.
The last direction I would like to mention is reconsidering your test implementation. This is especially relevant if you have a big project to test. Usually the application is separated into layers: services, DAOs, controllers, you know. My point is that testing that involves the DB should be used only for the DAO layer - this is where you test your SQL queries.
The business logic code usually doesn't require a DB connection and, in general, can be covered by unit tests that do not use Spring at all. So instead of using the @SpringBootTest annotation, which starts the whole application context, you can load only the configuration of the DAO(s); the chances are that this will start much faster and that the "slow beans" belong to other parts of the application. Spring Boot even has a special annotation for this (they have annotations for everything ;) ): @DataJpaTest.
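For instance, a DAO-layer test could be sliced like the sketch below instead of booting the whole application (the repository name is invented; @AutoConfigureTestDatabase with Replace.NONE keeps your real test container or local database instead of swapping in an embedded one):

import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.jdbc.AutoConfigureTestDatabase;
import org.springframework.boot.test.autoconfigure.orm.jpa.DataJpaTest;

// Loads only JPA-related beans (entities, repositories), not the whole context
@DataJpaTest
@AutoConfigureTestDatabase(replace = AutoConfigureTestDatabase.Replace.NONE)
class CustomerRepositoryDataJpaTest {

    @Autowired
    CustomerRepository customerRepository; // hypothetical Spring Data repository

    @Test
    void findsNothingInAnEmptyTable() {
        assertTrue(customerRepository.findAll().isEmpty());
    }
}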
This is based on the idea that the whole Spring testing package is intended for integration tests only; in general, a test where you start Spring is an integration test, and you'll probably prefer to work with unit tests wherever possible because they're much faster and do not use external dependencies: databases, remote services, etc.
The second issue: the schema often goes out of sync
In my current approach, the test container starts up, liquibase applies my schema and then the test is executed. Everything gets done from within the IDE, which is a bit more convenient.
I admit I haven't worked with Liquibase (we've used Flyway instead), but I believe the answer will be the same.
In a nutshell: this will keep working as it does now, and you don't need to change anything.
I'll explain.
Liquibase is supposed to start along with the Spring application context and apply the migrations, that's true. But before actually applying the migrations, it checks whether they have already been applied, and if the DB is in sync it does nothing. Flyway maintains a table in the DB for that purpose; I'm sure Liquibase uses a similar mechanism (its DATABASECHANGELOG table).
So as long as you're not creating tables or something like that inside the test, you should be good to go:
Assuming you're starting the Postgres server for the first time, the first test you run "at the beginning of your working day", following the aforementioned use case, will create the schema and deploy all the tables, indices, etc. with the help of the Liquibase migrations, and then run the test itself.
However, when you start the second test, the migrations will already have been applied. It's equivalent to restarting the application itself in a non-test scenario (staging, production, whatever): the restart itself won't re-apply the migrations to the DB. The same goes here...
OK, that's the easy case, but you probably populate data inside the tests (well, you should be ;) ). That's why I mentioned in the original answer that it's necessary to put the @Transactional annotation on the test itself.
This annotation starts a transaction before running the code in the test and artificially rolls it back afterwards - that is, it removes all the data populated by the test, despite the fact that the test has passed.
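To illustrate, such a test might look like the sketch below (the "customer" table is an assumption standing in for whatever your migrations create; the point is the @Transactional annotation on the test class):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.transaction.annotation.Transactional;

@SpringBootTest
@Transactional // every change the test makes is rolled back when it finishes
class CustomerTableTest {

    @Autowired
    JdbcTemplate jdbcTemplate;

    @Test
    void insertsAreRolledBackAfterTheTest() {
        // assumes a "customer" table created by the Liquibase migrations
        jdbcTemplate.update("insert into customer (name) values (?)", "Alice");

        int count = jdbcTemplate.queryForObject("select count(*) from customer", Integer.class);
        assertEquals(1, count);
        // the surrounding transaction is rolled back, so the row never persists
    }
}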
Now, to make it more complicated: what if you create tables or alter columns on existing tables inside the test? Well, this alone would drive Liquibase crazy even in production scenarios, so you probably shouldn't do that. But again, putting @Transactional on the test itself helps here, because PostgreSQL's DDL commands (just to clarify, DDL = Data Definition Language, so I mean commands like ALTER TABLE - basically anything that changes an existing schema) are also transactional. I know that Oracle, for example, didn't run DDL commands in a transaction, but things might have changed since then.
I don't think you can keep the context loaded.
What you can do is activate the reusable containers feature from Testcontainers. It prevents a container's destruction after a test is run.
You'll have to make sure that your tests are idempotent, or that they remove all the changes made to the container after completion.
In short, you should add .withReuse(true) to your container definition and add testcontainers.reuse.enable=true to ~/.testcontainers.properties (this is a file in your home directory).
Here's how I define my testcontainer to test my code with Oracle.
import org.testcontainers.containers.BindMode;
import org.testcontainers.containers.OracleContainer;

public class StaticOracleContainer {

    public static OracleContainer getContainer() {
        return LazyOracleContainer.ORACLE_CONTAINER;
    }

    private static class LazyOracleContainer {

        private static final OracleContainer ORACLE_CONTAINER = makeContainer();

        private static OracleContainer makeContainer() {
            final OracleContainer container = new OracleContainer()
                    // Username which testcontainers is going to use
                    // to find out if the container is up and running
                    .withUsername("SYSTEM")
                    // Password which testcontainers is going to use
                    // to find out if the container is up and running
                    .withPassword("123")
                    // Tell testcontainers that those ports should
                    // be mapped to external ports
                    .withExposedPorts(1521, 5500)
                    // Oracle database is not going to start if less
                    // than 1gb of shared memory is available, so this is necessary
                    .withSharedMemorySize(2147483648L)
                    // This is the same as giving the container
                    // -v /path/to/init_db.sql:/u01/app/oracle/scripts/startup/init_db.sql
                    // Oracle will execute init_db.sql after the container is started
                    .withClasspathResourceMapping("init_db.sql",
                            "/u01/app/oracle/scripts/startup/init_db.sql",
                            BindMode.READ_ONLY)
                    // Do not destroy the container
                    .withReuse(true);
            container.start();
            return container;
        }
    }
}
As you can see, this is a singleton. I need it to control the Testcontainers lifecycle manually, so that I can use reusable containers.
If you want to know how to use this singleton to add Oracle to the Spring test context, you can look at my example of using Testcontainers: https://github.com/poxu/testcontainers-spring-demo
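One way (not necessarily how the linked demo does it) to wire such a manually managed container into a Spring Boot test is @DynamicPropertySource, available since Spring Framework 5.2.5; the test class name below is made up:

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.OracleContainer;

@SpringBootTest
class OrderRepositoryIT {

    // Point Spring's DataSource at the reused container started by the singleton
    @DynamicPropertySource
    static void oracleProperties(DynamicPropertyRegistry registry) {
        OracleContainer oracle = StaticOracleContainer.getContainer();
        registry.add("spring.datasource.url", oracle::getJdbcUrl);
        registry.add("spring.datasource.username", oracle::getUsername);
        registry.add("spring.datasource.password", oracle::getPassword);
    }

    @Test
    void contextLoads() {
        // real tests using the database go here
    }
}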
There's one problem with this approach, though. Testcontainers is never going to stop a reusable container. You have to stop and destroy the container manually.
I can't imagine a hot-reload magic flag for testing - there is just so much that can dirty the Spring context, dirty the database, etc.
In my opinion, the easiest thing to do here is to locally replace the test container initializer with a manual container start and change the database properties to point to that container. If you want some automation for this, you could add a before-launch script (if you are using IntelliJ...) that does something like docker start postgres || docker run postgres (Linux), which will start the container if it's not running and do nothing if it is.
Usually the IDE recompiles just the affected classes anyway, and the Spring context probably won't take 15 seconds to start without a container starting, unless you have a lot of beans to configure as well...
I'm trying to learn testing with Spring Boot, so sorry if this answer is not relevant.
I came across this video that suggests a combination of (in order of most to least used):
Mockito unit tests with the @Mock annotation, with no Spring context, where possible
Slice tests using the @WebMvcTest annotation, when you want to involve some of the Spring context
Integration tests with the @SpringBootTest annotation, when you want to involve the entire Spring context
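As a sketch of the first style (the service and repository below are invented just to show the shape of such a test), a plain Mockito test needs no Spring context at all:

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

@ExtendWith(MockitoExtension.class)
class PriceServiceTest {

    // Hypothetical collaborator and class under test, defined inline for the example
    interface PriceRepository {
        double findBasePrice(String sku);
    }

    static class PriceService {
        private final PriceRepository prices;
        PriceService(PriceRepository prices) { this.prices = prices; }
        double priceWithDiscount(String sku, double discount) {
            return prices.findBasePrice(sku) * (1 - discount);
        }
    }

    @Mock
    PriceRepository priceRepository; // mocked dependency, no Spring context needed

    @InjectMocks
    PriceService priceService; // class under test, gets the mock injected

    @Test
    void appliesDiscount() {
        when(priceRepository.findBasePrice("book")).thenReturn(100.0);
        assertEquals(90.0, priceService.priceWithDiscount("book", 0.10), 0.001);
    }
}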

How to add data derived from source code file name or content to a resource at build time

Background
I'm adding database migrations to an existing project using the open source project mongobee. The actual migrations are all handled by mongobee and its concepts of changelogs and changesets. Part of this enhancement involves checking the current MongoDB database migration version at runtime and comparing it against the version expected by the Java application. The reasoning behind this is that we'd like an installation of our product to download code updates (new *.wars), and upon logging in to the new version of the application, the admin user would be prompted to update the database if their database version is lower than expected.
We're currently using Maven to package and build our software.
Problem
The one area that's nagging me is how to handle tagging the database version the Java source code expects. I'd like to avoid manually entering this each time we do a build and add a migration.
My proposed solution may not be ideal. My initial thought is to use a convention for the changelog file and class names, like "v0001_first_migration", and then at build time use maybe the Maven AntRun plugin to call a separately compiled Java class that traverses the migration changelog directory, finds the latest migration number, and stores that result in a resource file, probably XML. The application can then read that XML file at runtime to get the database version it expects.
1 - Is this feasible?
2 - Is there a way to do something like this in pure Maven without using AntRun?
3 - Is there another option to accomplish this easier?
As an alternative to my proposed solution above, I used a reflection project found here: https://github.com/ronmamo/reflections and iterated through all of the class names in my migrations directory that follow the aforementioned convention (v0001_first_migration, v0002_second_migration). I parse those with a regex to get an Integer and do comparisons to determine the migration version expected by the app. The database side was a lot easier, so I won't go over that.
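A sketch of that approach might look like the following (the package name and changelog class names are assumptions; the SubTypesScanner(false) argument tells Reflections to index plain classes so that getSubTypesOf(Object.class) returns them):

import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.reflections.Reflections;
import org.reflections.scanners.SubTypesScanner;

public class ExpectedMigrationVersion {

    // Matches class names like v0001_first_migration
    private static final Pattern VERSION_PATTERN = Pattern.compile("^v(\\d+)_.*");

    public static int resolve() {
        // Scan the (assumed) package that holds the mongobee changelog classes
        Reflections reflections = new Reflections("com.example.migrations", new SubTypesScanner(false));
        Set<Class<?>> candidates = reflections.getSubTypesOf(Object.class);

        int latest = 0;
        for (Class<?> candidate : candidates) {
            Matcher m = VERSION_PATTERN.matcher(candidate.getSimpleName());
            if (m.matches()) {
                latest = Math.max(latest, Integer.parseInt(m.group(1)));
            }
        }
        return latest;
    }
}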
Now, instead of using Ant tasks, I'm just popping the expected app migration version into a singleton (gross, I know) or, alternatively, just calling the function that finds the expected app migration version, depending on where it's used.
WHY a singleton? The parsing process is expensive, and I expect to use this data on each REST call that wants to touch our database. In the REST layer I created the singleton because of some limitations with our current project. The better way here, in the case of Tomcat, is to create a ServletListener and assign the migration version as an attribute of the ServletContext. Due to the way our REST layer works, I'd be modifying a TON of function signatures to pass in the @Context ServletContext. We don't have dependency injection containers either, so my options were limited if I didn't want to touch almost every action in the REST layer. The singleton gets the expected app migration version at startup and that's it, so it's still easy to test with mocks and there are no concurrency issues that I can see.

Spring profiles - risky code in svn

We are developing a project with the Spring Framework.
We are using a Tomcat cluster, and in order to do some really advanced integration tests we added some controllers to the web app that allow some risky operations that must not reach production.
What we learned is that in order to do so, we can use Spring profiles and annotate the risky controllers with
@Profile("Staging")
This annotation makes sure the bean is created only when the active profile is "Staging".
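For example (the controller name and mappings below are made up), the risky endpoints end up looking like this:

import org.springframework.context.annotation.Profile;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Only instantiated when the "Staging" profile is active
@Profile("Staging")
@RestController
@RequestMapping("/internal/test-hooks")
public class RiskyTestController {

    @PostMapping("/reset-data")
    public void resetTestData() {
        // dangerous operation used only by the integration tests
    }
}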
Call me paranoid, but this risky code now resides in our SVN and is part of the project code.
It seems that the slightest mistake could lead to this code becoming part of production and allowing risky actions for exploiters.
Moreover, if some programmer forgets the annotation, the code will reach production for sure.
We all make mistakes.
Is there any mitigation for this issue?
I'll call you a bit paranoid. (wink) Hopefully you also have integration tests in your application, and they usually set up some of the environment - if they were ever to run in a production environment, they would probably screw up your database, send messages to other systems, etc.
Yet you don't worry about that. Why? Maybe you can use the answer to that question to decide how you should package those risky pieces of code.
My suggestion: keep all the risky code in a single module (if you are using a multi-module build), and don't include this module in the production build (you can use Maven profiles for that).
Or... let the code check for itself whether it is allowed to run. Perhaps it can check for the presence of a certain file on the file system that you only create in your test environment.
It depends really on what you worry about.
But it is good to think about it. I know stories where load testing resulted in many orders being placed in an actual (external) order processing system.
The mistake you are speaking about is adding "Staging" to the list of active profiles. Yes, it is easy to do this. However, it is also easy to remove files from the file system, format the hard disk, or turn the electricity off. So your question really sounds like a kind of paranoia... :)
I think the problem is not Spring profiles but your development methodology. If you are not sure about some code, it should not be in production at all. How to achieve this? Move from SVN to Git and start using branches. Each task is a branch, without exceptions, and each task must be tested. You can deploy any branch you want to staging, test it, and when you are sure the code is OK, merge/rebase it into master. Master should be tested as well, and then it can be deployed to production.
In this case you do not need the "Staging" profile.

Is there a way to prevent Maven Test from rebuilding the database?

I've recently been asked to, effectively, sell my department on unit testing. I can't tell you how excited this makes me, but I do have one concern. We're using JUnit with Spring and Maven, and this means that each time mvn test is called, it rebuilds the database. Obviously, we can't integrate that with our production server -- it would kill valuable data.
How do I prevent the rebuilding without telling maven to skip testing?
The best I could figure out was to point the script at a test database (line breaks added for readability):
mvn test
-Ddbunit.schema=<database>test
-Djdbc.url=jdbc:mysql://localhost/<database>test?
createDatabaseIfNotExist=true&
useUnicode=true&characterEncoding=utf-8
I can't help but think there must be a better way.
I'm especially interested in learning whether there is an easy way to tell Maven to run the tests for particular classes only, without building anything else; mvn -Dtest=<test-name> test still rebuilds the database.
======= update =======
Bit of egg on my face here. I didn't realize that I was using the same variable in two places, meaning that the POM was using a "skip.test" variable for both rebuilding the database and for running the tests...
Update: I guess that DBUnit does the rebuilding of the DB because it is told to do so in the test setup method. If you change your setup method, you can eliminate the DB rebuild. Of course, you should do it so that you get the DB reset when you need it, and omit it when you don't. My first bet would be to use a system property to control this. You can set the property on the command line the same way you already do with jdbc.url et al. Then in the setup method you add an if to test for that property and do the DB reset if it is set.
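A sketch of what that could look like inside the existing test setup (the property name is arbitrary, and the reset call stands in for whatever the current DBUnit setup does; getConnection() and getDataSet() are assumed to be the helpers the existing tests already provide):

// Run with: mvn test -Ddb.reset=true whenever a rebuild is actually wanted
@Before
public void setUp() throws Exception {
    if (Boolean.getBoolean("db.reset")) {
        // existing DBUnit reset logic, e.g. a CLEAN_INSERT of the dataset
        DatabaseOperation.CLEAN_INSERT.execute(getConnection(), getDataSet());
    }
}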
A test database, completely separated from your production DB, is definitely the best choice if you can have it. You can even use e.g. Derby, an in-memory DB which can run embedded within the JVM. But if you absolutely can't have a separate DB, at least use a separate test schema inside that DB.
In this scenario I would recommend you put your DB connection parameters into profiles within your POM, the default being the test DB and a separate profile containing the production settings. This way it can never happen that you accidentally run your tests against the production DB.
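A sketch of that profile setup (the property name mirrors the jdbc.url property already used above; the URLs are placeholders):

<profiles>
    <!-- default: tests run against the test database -->
    <profile>
        <id>test-db</id>
        <activation>
            <activeByDefault>true</activeByDefault>
        </activation>
        <properties>
            <jdbc.url>jdbc:mysql://localhost/mydbtest?createDatabaseIfNotExist=true</jdbc.url>
        </properties>
    </profile>
    <!-- only activated explicitly, e.g. mvn -Pprod-db ... -->
    <profile>
        <id>prod-db</id>
        <properties>
            <jdbc.url>jdbc:mysql://prod-host/mydb</jdbc.url>
        </properties>
    </profile>
</profiles>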
In general, however, it is also important to understand that tests run against a DB are not really unit tests in the strict sense, but rather integration tests. If you have an existing set of such tests, fine, use them as much as you can. However, you should try to move towards adding more real unit tests, which test only a small, isolated portion of your code at once (a method or a class at most), ideally self-contained (needing no DB, network, config files, etc.) so they can run fast - this is a very important point. If you have 5000 unit tests and each takes 5 seconds to run, that totals up to almost 7 hours, so you obviously won't run them very often. If a test takes only 5 milliseconds, you get the results in less than half a minute, so you can afford to run all your tests before you commit your latest change - many times a day. That makes a huge difference in the speed of feedback you get from the tests.
Hope this helps.
We're using JUnit with Spring and Maven, and this means that each time mvn test is called, it rebuilds the database.
Maven doesn't do anything with databases by itself, your code does. In any case, it's very unusual to run tests (which are not unit tests) against a production database.
How do I prevent the rebuilding without telling maven to skip testing?
Hard to say without more details (you're not showing anything) but profiles might be a way to go.
Unit tests, by definition, only operate on a single component in the system. You should not be attempting to write unit tests that integrate with any external services (web, DB, etc.). The solution I have for this is to use a good mocking framework to stub out the behaviour of any dependencies your components have. This encourages good interface APIs, since most mocking frameworks work best with simple interfaces. It would be best to create a Repository-pattern interface for any interactions with your DB and then mock out the implementation any time you are testing a class that interacts with it. You can then functionally test your Repository implementation separately. This also has the added benefit of keeping your unit tests fast enough to remain part of your CI, so that your feedback cycle is as fast as possible.
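As a sketch of that idea (the interface and service here are invented and defined inline just for the example), a test for a class that depends only on a repository interface never touches the database:

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class ReportServiceTest {

    // Hypothetical repository interface that hides all DB access
    public interface CustomerRepository {
        long countActiveCustomers();
    }

    // Hypothetical class under test, which depends only on the interface
    public static class ReportService {
        private final CustomerRepository customers;
        public ReportService(CustomerRepository customers) { this.customers = customers; }
        public String headline() { return customers.countActiveCustomers() + " active customers"; }
    }

    @Test
    public void buildsHeadlineWithoutTouchingTheDatabase() {
        CustomerRepository repo = mock(CustomerRepository.class);
        when(repo.countActiveCustomers()).thenReturn(42L);

        assertEquals("42 active customers", new ReportService(repo).headline());
    }
}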
