Flyway partial migration of legacy application - java

We have an application with a custom database migrator that we want to replace with Flyway.
These migrations are split into categories such as "account" for user management and "catalog" for the product catalog.
Files are named $category.migration.$version.sql. Here, $category is one of the above categories and $version is an integer version starting from 0.
e.g. account.migration.23.sql
Although one could argue that each category should be a separate database, in fact they all share one, and a major refactoring would be required to change that.
I could also use one schema per category, but again this would require rewriting all SQL queries.
So I did the following:
Move $category.migration.$version.sql to /sql/$category/V$version__$category.sql (e.g. account.migration.1.sql becomes /sql/account/V1__account.sql)
Use a metadata table per category
Set the baseline version to zero
In code, that would be:
String[] _categories = new String[] { "catalog", "account" };
for (String _category : _categories) {
    Flyway _flyway = new Flyway();
    _flyway.setDataSource(databaseUrl.getUrl(), databaseUrl.getUser(), databaseUrl.getPassword());
    _flyway.setBaselineVersion(MigrationVersion.fromVersion("0"));
    _flyway.setLocations("classpath:/sql/" + _category);
    // _version is the latest migration version for this category (defined elsewhere)
    _flyway.setTarget(MigrationVersion.fromVersion(_version + ""));
    _flyway.setTable(_category + "_schema_version");
    _flyway.setBaselineOnMigrate(true); // (1)
    _flyway.migrate();
}
So there would be the metadata tables catalog_schema_version and account_schema_version.
Now the issue is as follows:
Starting with an empty database I would like to apply all pre-existing migrations per category, as done above.
If I remove _flyway.setBaselineOnMigrate(true); (1), then the catalog migration (the first one) succeeds, but for account Flyway complains that the schema "public" is not empty.
Conversely, with _flyway.setBaselineOnMigrate(true); set, the behavior is as follows:
The migration of "catalog" succeeds, but V0__account.sql is ignored and Flyway starts with V1__account.sql, maybe because it somehow still thinks the database was already baselined?
Does anyone have a suggestion for resolving the problem?

Your easiest solution is to keep each schema_version table in a schema of its own. I've answered a very similar question here.
Regarding your observation on baseline, that is expected behavior. The migration of account starts at v1 because, with the combination of baseline=0, baselineOnMigrate=true and a non-empty target schema (because catalog has populated it), Flyway has determined this is a pre-existing database that is already at the baseline version - thus it starts at v1.
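A minimal sketch of that idea using the same setter API as above (hedged: setSchemas() also changes the default schema of the migration connection, so your migration SQL may need to reference public explicitly or set the search_path itself; the schema names are illustrative and this is untested):
// Each category gets a private schema that holds only its metadata table.
// Flyway creates the schema if it does not exist, and since it is empty,
// neither baseline nor the "schema public is not empty" check gets in the way.
String[] _categories = new String[] { "catalog", "account" };
for (String _category : _categories) {
    Flyway _flyway = new Flyway();
    _flyway.setDataSource(databaseUrl.getUrl(), databaseUrl.getUser(), databaseUrl.getPassword());
    _flyway.setSchemas(_category + "_flyway"); // metadata schema per category ("_flyway" suffix is an arbitrary choice)
    _flyway.setTable(_category + "_schema_version");
    _flyway.setLocations("classpath:/sql/" + _category);
    _flyway.migrate();
}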


Update "from" part of a Apache Camel JPA route

I have an Apache Camel route between two JPA endpoints:
from("jpa://Data").to("jpa://DataConverted");
I basically want to do two things: fetch and copy data from my Data entity table to a similar DataConverted entity table in another database, and mark my Data entities with setHasBeenCopied(true), but only after the copy has succeeded.
My route looks as follows:
from("jpa://Data").process(ex -> {
Data data = ex.getIn().getBody(Data.class);
DataConverted dataConverted = convertData(data);
ex.getMessage().setBody(dataConverted);
})
.recipientList(constant("direct:DataConverted","direct:updateFlag")).end();
from("direct:DataConverted").to("jpa://DataConverted").end;
from("direct:updateFlag").process(ex -> {
DataConverted dataConverted = ex.getIn().getBody(DataConverted.class);
var originalData = myDao.getData(dataConverted.getId());
originalData.setHasBeenCopied(true);
}).to("jpa://Data).end();
This runs without error; however, it isn't setting the flag in my original database!
What did work was to call data.setHasBeenCopied(true); in the first process, directly after from("jpa://Data") - however, this means the flag is set before the copy, and if anything goes wrong during the copy (e.g. the target database isn't available) the route will crash but the flag will stay set for that one data entity.
Note that I haven't called transacted() on my route as that didn't work out for me (multiple interfering transactions were opened).
Any idea how to proceed? Is Camel unable to update existing data via .to()? I can add my Camel configurations of the endpoints and such if needed, but it would probably get a bit long.
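One detail worth checking (an observation about the snippet above, not a confirmed fix): the updateFlag processor modifies originalData but never puts it back on the exchange, so the jpa://Data producer receives the DataConverted body rather than the Data entity it is supposed to write. A sketch with the body set explicitly:
from("direct:updateFlag").process(ex -> {
    DataConverted dataConverted = ex.getIn().getBody(DataConverted.class);
    Data originalData = myDao.getData(dataConverted.getId());
    originalData.setHasBeenCopied(true);
    // Hand the modified entity to the JPA producer; by default the producer
    // merges the body, so an existing (detached) entity gets updated.
    ex.getMessage().setBody(originalData);
}).to("jpa://Data");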

Android Room Partial Migration Testing

The codebase I'm working on (NewPipe) uses Android Room. It has an AppDatabase which extends RoomDatabase (the Android Room class), a StreamDAO, and a StreamEntity. I added a column to StreamEntity, and I incremented the @Database version from 3 to 4. I also added a Migration from 3 to 4.
The problem is that there was previously a test for the Migration from version 2 to 3. When I try to run the test, I get the error "java.lang.IllegalStateException: A migration from 3 to 4 was required but not found. Please provide the necessary Migration path via RoomDatabase.Builder.addMigration(Migration ...) or allow for destructive migrations via one of the RoomDatabase.Builder.fallbackToDestructiveMigration* methods." I can fix this error by adding .addMigrations(MIGRATION_3_4) to this line. But that then also runs the migration from version 3 to 4, which I would like to isolate in a separate test.
The getMigratedDatabase() function is actually only needed in the test for data validation (in addition to the automated migration verification). I am able to get the data from the (partially) migrated database by running queries on the partially migrated database, but I can't get the data as a StreamEntity.
How can I test partially migrating the database as well as access the StreamDAO on the partially migrated database?
Edit:
I understand (from the Android Developers docs on Testing Single Migrations) that "You cannot use DAO classes because they expect the latest schema." I can get all the data out with (Kotlin):
query("SELECT * FROM streams").run {
    DatabaseUtils.dumpCursorToString(this)
}
However, I can't convert it to StreamEntity for easier data testing.
If you want to test data which is only half-migrated, you would have to create a matching (legacy) DB, DAO and entities (not recommended).
I think you're better off reading the separate column values and then either just examining those, or taking the values and constructing the StreamEntity yourself.
Something like this (Java):
db = helper.runMigrationsAndValidate(AppDatabase.DATABASE_NAME, 3, false, MIGRATION_2_3);
Cursor cursor = db.query("SELECT * FROM streams");
cursor.moveToFirst();
assertEquals(expectedColumnValue, cursor.getString(cursor.getColumnIndex("columnName1")));
assertNull(cursor.getString(cursor.getColumnIndex("columnName2")));
Then add another test for the whole migration (to v4); there you can use your DAO methods and examine StreamEntity directly, confirming that the DAO constructs StreamEntity properly.
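A sketch of that second test, following the usual MigrationTestHelper pattern, inside the same test class that already defines helper and the migrations (streamDAO().getAll() is a hypothetical accessor; adapt it to NewPipe's actual DAO):
@Test
public void migrateAll_thenReadThroughDao() throws IOException {
    // Create the DB at version 2 and close it again.
    helper.createDatabase(AppDatabase.DATABASE_NAME, 2).close();

    // Open it at the latest version with the full migration chain; Room
    // validates the final schema when the database is first used.
    AppDatabase appDb = Room.databaseBuilder(
            InstrumentationRegistry.getInstrumentation().getTargetContext(),
            AppDatabase.class, AppDatabase.DATABASE_NAME)
        .addMigrations(MIGRATION_2_3, MIGRATION_3_4)
        .build();
    appDb.getOpenHelper().getWritableDatabase(); // triggers the migrations

    // The schema is now current, so DAO methods are safe to use.
    List<StreamEntity> streams = appDb.streamDAO().getAll(); // getAll() is hypothetical
    appDb.close();
}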

Rolling back a Postgres database using Liquibase in Java

I'm trying to roll back the changes to a Postgres table between component tests so each one has a clean DB to work with.
I'm using Liquibase to set up Postgres (a changelog XML to describe the setup, applied via the liquibase-core Kotlin/Java library). I'm also using Hibernate to interact with Postgres directly. The test framework is Kotest, using the beforeXXX methods to make sure all the setup happens before the tests run. The database is set up once before everything runs, and the idea is to roll back after each test.
From looking at the docs, tagDatabase and rollback seem to be what I need; however, when running them they don't seem to actually roll anything back.
The code is roughly as follows (this is just test code to see if it works at all, mind - the code would ideally be segmented as I described above):
// 1 - (Pre-all-tests) Postgres setup
liquibase = Liquibase(
    "/db/changelog/changelog-master.xml",
    ClassLoaderResourceAccessor(),
    DatabaseFactory.getInstance().findCorrectDatabaseImplementation(JdbcConnection(connection))
)
liquibase.update(Contexts(), LabelExpression())
liquibase.tag("initialised")

// 2 - Something is inserted
val newEntity = ThingEntity()
entityManager.persist(newEntity)
entityManager.transaction.commit()
entityManager.clear()

// 3 - Cleanup
liquibase.rollback("initialised", Contexts())

// 4 - Fetching
entityManager.find(ThingEntity::class.java, id)
Thing is, after running liquibase.rollback, the newEntity I persisted earlier is still present. The tag has disappeared - the doesTagExist method returns true before the rollback and false after, so the tag at least is being removed.
Given that I'm clearing the entity manager after the commit, I don't think it's a caching issue, and as I said the tag is being removed - just not the data.
Can anyone tell me why the actual transactions (i.e. the persist) aren't being erased?
Thanks!
It looks like you are using Liquibase the wrong way. What you are trying to do (roll back data that was added in a unit test) is close to what is described here: Rollback transaction after @Test
And when you ask Liquibase to roll back to some tag, it just executes the rollback scripts (if any are provided) for the changesets that were applied after the changeset with that tag: https://docs.liquibase.com/commands/community/rollbackbytag.html
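A minimal sketch of that transaction-per-test pattern in plain JPA (Java; the names are illustrative, and the wiring into Kotest's beforeTest/afterTest hooks is left out):
// Open a transaction before the test body and roll it back afterwards,
// so nothing the test persists ever becomes visible to the next test.
EntityManager em = entityManagerFactory.createEntityManager();
em.getTransaction().begin();
try {
    ThingEntity entity = new ThingEntity();
    em.persist(entity);
    // ... assertions against the managed state ...
} finally {
    em.getTransaction().rollback(); // discards the insert instead of committing
    em.close();
}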

Failed to make bulk upsert using mongo

I'm trying to do an upsert using the MongoDB driver; here is the code:
BulkWriteOperation builder = coll.initializeUnorderedBulkOperation();
DBObject toDBObject;
for (T entity : entities) {
    toDBObject = morphia.toDBObject(entity);
    builder.find(toDBObject).upsert().replaceOne(toDBObject);
}
BulkWriteResult result = builder.execute();
where "entity" is morphia object. When I'm running the code first time (there are no entities in the DB, so all of the queries should be insert) it works fine and I see the entities in the database with generated _id field. Second run I'm changing some fields and trying to save changed entities and then I receive the folowing error from mongo:
E11000 duplicate key error collection: statistics.counters index: _id_ dup key: { : ObjectId('56adfbf43d801b870e63be29') }
What did I forget to configure in my example?
I don't know the structure of dbObject, but the bulk upsert needs a valid query in order to work.
Let's say, for example, that you have a unique (_id) property called "id". A valid query would look like:
builder.find({id: toDBObject.id}).upsert().replaceOne(toDBObject);
This way, the engine can (a) find an object to update and then (b) update it (or insert if the object wasn't found). Of course, you need the Java syntax for find, but the same rule applies: make sure your .find will find something, then do an update.
I believe (just a guess) that the way it's written now will find "all" docs and try to update the first one ... but the behavior you are describing suggests it's finding "no doc" and attempting an insert.
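In the legacy Java driver API used in the question, that could look roughly like this (a sketch, assuming each document already carries a populated _id after the first run):
// Match each document by _id so the second run updates the existing document
// instead of inserting a duplicate and colliding on the unique _id index.
BulkWriteOperation builder = coll.initializeUnorderedBulkOperation();
for (T entity : entities) {
    DBObject doc = morphia.toDBObject(entity);
    DBObject query = new BasicDBObject("_id", doc.get("_id"));
    builder.find(query).upsert().replaceOne(doc);
}
BulkWriteResult result = builder.execute();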

How to store all user activities in a website?

I have a web application built in Django + Python that interacts with web services (written in Java).
All the database management (i.e. all CRUD operations against the actual database) is done by the web services.
Now I have to track all user activities on my website in some log table.
For example, if a user posts a new article, the web services create a new row in the Articles table, and alongside that I need to add a new row to the log table, something like "User Raman has posted a new article (with ID, title, etc.)".
I have to do this for all objects in my database, like "Article", "Media", "Comments", etc.
Note: I am using PostgreSQL.
So what is the best way to achieve this? Should I do it in PostgreSQL or Java, and how?
So, you have UI <-> Web Services <-> DB
Since the web services talk to the DB and contain the business logic (i.e. I guess you validate things there, create your queries and execute them), the best place to 'log' activities is in the services themselves.
IMO, logging PostgreSQL transactions is a different thing. It's not the same as logging 'user activities' anymore.
EDIT: This still means you create DB schema for 'logs' and write them to DB.
Second EDIT: Catching log-worthy events in the UI and logging them from there might not be the best idea either. You would have to rewrite the logging if you ever decide to replace the UI, or, for example, write an alternate UI for, say, mobile devices.
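As an illustration of logging from the Java service layer, a sketch (every name below is made up for the example; the fields mirror the log-table columns suggested further down):
// Record the activity right after the business operation succeeds, in the
// same service method that performs the CRUD call.
public void postArticle(Article article, long userId) {
    articleDao.insert(article);            // the actual CRUD operation
    activityLogDao.insert(new ActivityLogEntry(
            userId,            // user_id: who did it
            "posted_article",  // activity_type
            article.getId(),   // object_id
            "Article"));       // object_type
}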
For an audit table within the DB itself, have a look at the PL/pgSQL Trigger Audit Example
This logs every INSERT, UPDATE, DELETE into another table.
In your log table you can have various columns, including:
user_id (the user that did the action)
activity_type (the type of activity, such as view or commented_on)
object_id (the actual object that it concerns, such as the Article or Media)
object_type (the type of object; this can be used later, in combination with object_id to lookup the object in the database)
This way, you can keep track of all actions the users do. You'd need to update this table whenever something happens that you wish to track.
Whenever we had to do this, we overrode signals for every model and possible action.
https://docs.djangoproject.com/en/dev/topics/signals/
You can have the signal do whatever you want, from injecting some HTML into the page, to making an entry in the database. They're an excellent tool to learn to use.
I used django-audit-log and I am very satisfied.
Django-audit-log can track multiple models, each in its own additional table. All of these tables are fairly uniform, so it should be straightforward to create a SQL view that shows data for all models.
Here is what I've done to track a single model ("Pauza"):
class Pauza(models.Model):
    started = models.TimeField(null=True, blank=False)
    ended = models.TimeField(null=True, blank=True)
    # ... more fields ...
    audit_log = AuditLog()
If you want changes to show in Django Admin, you can create an unmanaged model (but this is by no means required):
class PauzaAction(models.Model):
    started = models.TimeField(null=True, blank=True)
    ended = models.TimeField(null=True, blank=True)
    # ... more fields ...
    # fields added by Audit Trail:
    action_id = models.PositiveIntegerField(primary_key=True, default=1, blank=True)
    action_user = models.ForeignKey(User, null=True, blank=True)
    action_date = models.DateTimeField(null=True, blank=True)
    action_type = models.CharField(max_length=31, choices=(('I', 'create'), ('U', 'update'), ('D', 'delete'),), null=True, blank=True)
    pauza = models.ForeignKey(Pauza, db_column='id', on_delete=models.DO_NOTHING, default=0, null=True, blank=True)

    class Meta:
        db_table = 'testapp_pauzaauditlogentry'
        managed = False
        app_label = 'testapp'
Table testapp_pauzaauditlogentry is automatically created by django-audit-log, this merely creates a model for displaying data from it.
It may be a good idea to throw in some rude tamper protection:
class PauzaAction(models.Model):
    # ... all like above, plus:

    def save(self, *args, **kwargs):
        raise Exception('Permission Denied')

    def delete(self, *args, **kwargs):
        raise Exception('Permission Denied')
As I said, I imagine you could create a SQL view with the four action_ fields and an additional action_model field containing a varchar reference to the model itself (maybe just the original table name).
