Enhance persistence.xml for database update - java

For development and deployment of my WAR application I use the drop-and-create functionality: basically erasing everything from the database and then automatically recreating all the necessary tables and fields according to my @Entity classes.
Obviously, for production the drop-and-create functionality is out of the question. How would I have to create the database tables and fields?
The nice thing about @Entity classes is that thanks to JPQL and the use of the EntityManager all the database queries are generated, so the WAR application stays database independent. If I now had to write the queries by hand in SQL and then let the application execute them, I would have to decide which SQL dialect to use (i.e. MySQL, Oracle, SQL Server, ...). Is there a way to create the tables database independently? Is there a way to run structural database updates database independently as well (i.e. from database version 1 to database version 2)? Like altering field or table names, adding tables, dropping tables, etc.?

Thank you @Qwerky for mentioning Liquibase. This is absolutely a solution and perfect for my case, as I won't have to worry about versioning anymore. Liquibase is very easy to understand and can be picked up in minutes.
For anyone looking for database versioning / schema migration:
Liquibase

Related

Java Application - Can I store my SQL queries in the DB rather than a file packaged inside the application?

As the application gets complicated, one thing that changes a lot is the queries, especially the complex ones. Wouldn't it be easier to maintain the queries in the DB rather than in a resources location inside the package, so that they can be enhanced easily without a code change? What are the drawbacks of this?
You can use stored procedures to save your queries in the database. Then your Java code can just call the procedure instead of building a complex query.
See Wikipedia for a more detailed explanation of stored procedures:
https://en.wikipedia.org/wiki/Stored_procedure
You can find details about the implementation and usage in the documentation of your database system (MySQL, MariaDB, Oracle...).
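As a minimal sketch, calling such a procedure over plain JDBC could look like this (the connection URL, credentials and the procedure name find_orders_by_status are invented for illustration):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ProcedureCallExample {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/appdb", "user", "secret");
             // {call ...} is the standard JDBC escape syntax for procedures
             CallableStatement call = con.prepareCall("{call find_orders_by_status(?)}")) {
            call.setString(1, "OPEN");
            try (ResultSet rs = call.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id"));
                }
            }
        }
    }
}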
When you decide to move logic to the database, you should use a version control system for databases like Liquibase: https://www.liquibase.org/get-started/quickstart
You can write the changes to your database code in XML, JSON or even YAML and check them in to your version control system (SVN, Git...). This way you have a history of the changes and can roll back to a previous version of your procedure if something goes wrong.
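As a rough sketch of how that ties into application code, the pre-4.x Liquibase Java API can apply such a changelog at startup (the JDBC URL and the changelog path db/changelog.xml are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.ClassLoaderResourceAccessor;

public class LiquibaseRunner {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/appdb", "user", "secret")) {
            // wrap the JDBC connection in Liquibase's database abstraction
            Database db = DatabaseFactory.getInstance()
                    .findCorrectDatabaseImplementation(new JdbcConnection(con));
            Liquibase liquibase = new Liquibase(
                    "db/changelog.xml", new ClassLoaderResourceAccessor(), db);
            liquibase.update("");  // apply all pending changesets
        }
    }
}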
You also asked why some people use stored procedures and others keep their queries in the code.
Stored procedures can encapsulate the query and provide an interface to the data. They can be faster than queries issued from the application. That is good.
But there are also problems:
You distribute the business logic of your application between the database and the program code. It can really be troublesome if the logic is spread through all technical layers of your application.
It is not so simple anymore to switch from an Oracle database to MariaDB if you use specific features of the database system. You have to migrate or rewrite the procedures.
You have to integrate Liquibase or another system into your build pipeline to keep track of your database changes.
So it depends on the project and its size whether either of the solutions is better.

database independent data migration

My goal is to enable schema and data migration for an existing application.
This kind of question seems to have been asked many times, however with different requirements and circumstances than mine, I think.
Since I am inexperienced in this domain, allow me to lay out the architecture of the app and my assumptions first.
Architecture
The app is a multi-user enterprise desktop application with a backend server that can persist to any major DB (MySQL, PostgreSQL, SQL Server, Oracle DB, etc.). It is assumed the DB is on-premise and maintained by our clients.
The tech stack used is a fairly common Hibernate+Spring+RMI/JMS combo.
Currently, migrations are done by the server in the following way:
On server start it checks for the latest expected schema version
If larger than the current version, start migration to next version until current==latest:
Create new database
Load (whole) latest schema (SQL script with a lot of CREATE TABLE ...)
Migrate data (in Java classes using 2 JDBC-Connections to old and new schema)
Load (all) latest constraints (SQL script with a lot of ALTER TABLE ...)
This migration is slow and forward-only, but it is simple. The problem is that until now the schema scripts and the queries in the data migrations have used MySQL syntax and features.
Note that by migrate data I mean: the backend server copies the data from the old schema to the new one, transforming it if necessary.
Also, the migration process starts automatically on-premise at our clients'. Meaning, we only have control over the JDBC connection, but have no direct access to the database and no knowledge about the specific database being used (MySQL, SQL Server, ...).
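To illustrate what the data step looks like in code, here is a stripped-down sketch of such a two-connection copy (the app_user table, its columns and the name split are invented for illustration):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserCopyStep {
    void migrate(Connection oldDb, Connection newDb) throws SQLException {
        try (PreparedStatement read = oldDb.prepareStatement(
                     "SELECT id, name FROM app_user");
             PreparedStatement write = newDb.prepareStatement(
                     "INSERT INTO app_user (id, first_name, last_name) VALUES (?, ?, ?)");
             ResultSet rs = read.executeQuery()) {
            while (rs.next()) {
                // transform: split the old single name column into two columns
                String[] parts = rs.getString("name").split(" ", 2);
                write.setLong(1, rs.getLong("id"));
                write.setString(2, parts[0]);
                write.setString(3, parts.length > 1 ? parts[1] : "");
                write.addBatch();
            }
            write.executeBatch();
        }
    }
}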
The goal is to either replace or augment this migration scheme with a database independent one.
Assumptions and research
StackOverflow 1 2 3 4 5 6 7: Answers state to use Hibernate's inbuilt feature. However, the docs state that this is not production ready. Also, AFAICT, all answers are concerned with schema migration only.
Liquibase: Uses a custom DSL (in XML/JSON/YAML/etc) to allow for database independent schema migration only.
DBUnit: Uses a custom XML DSL to capture snapshots of database states. It cannot transform a snapshot of schema version 1 into version 2.
Flyway: In principle the same as Liquibase, but not database independent, because plain SQL scripts are used for migrations.
jOOQ: A database independent query DSL in Java on top of JDBC. Comparable to the Criteria API but without the drawbacks of JPA. Should in principle allow for database independent data migration; however, it does not help with schema migration.
JPA query languages like HQL, JPQL and the Criteria API are not sufficient, because
one cannot reference tables not mapped by the entity manager, e.g. join tables, metadata and audit tables;
a copy of all versions of the Entity classes needs to be kept around for the mapping.
Question
I realize that, as this question stands now, it will be dismissed as opinion-based.
However, I am not necessarily looking for specific solutions to this problem (I doubt there exists a clear solution for such a complex problem space) but rather to validate my assumptions.
Namely, is it true that
Liquibase and Flyway are mainly concerned with schema migration and data migration is left as an exercise for the reader?
in order for Flyway to support multiple, different databases, one needs to duplicate the migration scripts per database?
by and large, the problem of database independent data migration remains unresolved in enterprise Java?
Even if I were to combine Liquibase/Flyway with jOOQ, I do not see how to perform a data migration, because Liquibase/Flyway migrate databases in place. The old database gets destroyed, and with it the opportunity to transform the old data into the new schema.
Thanks for your attention!
Let's break it down a little bit. You're right in that this is largely opinion-based, but here's what I've noticed in my experience.
Liquibase and Flyway are mainly concerned with schema migration and data migration is left as an exercise for the reader?
You can do data migration with Liquibase and Flyway; it's something I've done pretty often. Take the example where I want to split a User table into User and Address tables. I'd write a migration script, which is basically just an SQL file, to create the new Address table and then copy all the relevant data into it.
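When the transformation is too awkward for plain SQL, Flyway also accepts Java-based migrations; a hedged sketch of the same User/Address split (Flyway 5+ API, with invented table and column names):

import java.sql.Statement;
import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

public class V10__SplitUserAddress extends BaseJavaMigration {
    @Override
    public void migrate(Context context) throws Exception {
        try (Statement stmt = context.getConnection().createStatement()) {
            stmt.execute("CREATE TABLE address (id BIGINT PRIMARY KEY, "
                    + "user_id BIGINT, street VARCHAR(255), city VARCHAR(255))");
            // copy the relevant columns out of the old user table
            stmt.execute("INSERT INTO address (id, user_id, street, city) "
                    + "SELECT id, id, street, city FROM app_user");
            stmt.execute("ALTER TABLE app_user DROP COLUMN street");
            stmt.execute("ALTER TABLE app_user DROP COLUMN city");
        }
    }
}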
in order for Flyway to support multiple, different databases, one needs to duplicate the migration scripts per database?
Possibly. Flyway and Liquibase are better thought of as database versioning tools. If my app needs version 10 of the database, these tools help me get to that point. Again, the migration scripts are just basic .sql files; if you're using some MySQL-specific functions, those will just go into the migration script, and they won't work on SQL Server.
by and large, the problem of database independent data migration remains unresolved in enterprise Java?
Eh, I'm not sure about this one. I agree it's a problem, but in practice it's not a huge one. For the past 8+ years, I've only written ANSI SQL; it should be portable everywhere. So in theory, we can lift those applications onto a different database. JPA and the various implementations help with those differences. Depending on how your project was built, say an application that has all of its business logic in implementation-specific SQL functions, it's going to be a headache. If you're using the database for CRUD, and I'd argue that's all you should be using it for, then it's not a huge deal.
So all that said, I think you might have the wrong idea about Flyway and Liquibase. Like I said earlier, they aren't really 'migration tools' so much as they are database versioning tools. With an ordered list of specific SQL migration scripts, I can guarantee the state of my database at any version. I'm not sure these are tools that I'd use to 'migrate' a legacy SQL Server-based application into a Postgres-based application.

How to initialise data in JPA, only if the Schema is generated

Hello,
I am working on changing my Java application from using Postgres to an embedded database. I would like the application to deploy with an initial set of data in the database. In the past, during installation, I have executed an SQL script to fully generate the schema and insert the data into my tables.
Ideally (because I don't really want to work out how to connect to the embedded database to generate it) I want to let JPA create my schema for the first time, and when it does, I then want to be able to run my SQL to insert the data.
My search has turned up the obvious Hibernate and JPA properties that allow running an SQL script.
Firstly, I found that when using "hibernate.hbm2ddl.auto" you can define an import.sql file. This made me very happy for a day, until I realised it only works with create and not with update. My application, when using Postgres, had this set to update. What I would really like is for it to know whether it had to create the schema, and if so, run the import.sql. No joy though.
I then moved on to using "javax.persistence.schema-generation.database.action" set to "create". I figured using the JPA specification was probably wiser anyway, and so I defined "javax.persistence.sql-load-script-source". The spec says for "create":
The provider will create the database artifacts on application
deployment. The artifacts will remain unchanged after application
redeployment.
This led me to believe it would do exactly what I wanted: only create the tables "on application deployment". However, when I ran my tests using this, each test (creating a new Spring context) tried to just create all the tables again and obviously failed. This made me realise that application deployment didn't mean what I thought it meant (wishful thinking), and now I realise that JPA doesn't even seem to have an equivalent of Hibernate's "update" setting, so it's always going to generate the tables?
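For reference, those two standard properties can also be passed programmatically; a minimal sketch, assuming a persistence unit named "app" and a load script at META-INF/load.sql (both names are placeholders):

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class SchemaGenerationExample {
    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // create all tables on startup, then run the load script
        props.put("javax.persistence.schema-generation.database.action", "create");
        props.put("javax.persistence.sql-load-script-source", "META-INF/load.sql");
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("app", props);
        emf.close();
    }
}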
What I want is to have my tables and data generated when you first spin up the app, and for subsequent executions to know the data is there and use it. I am assuming it's too much to hope that this exists, but I'm sure this must be a common requirement. So my question is: what is the generally recommended way of letting JPA create my schema while being able to insert some data into a DB that persists between executions?
The answer is Flyway. It is a database migration library, and if you are using Spring Boot it is seamlessly integrated; with regular Spring you have to create a bean which gets a reference to the connection pool, creates a connection and performs the migration.
Flyway creates a table so it keeps track of which scripts have already been applied to the database, and the scripts are simply part of the resources.
We normally use JPA to generate the initial script. This script becomes V1__initial.sql; if we need to add some data we can add V2__addUsers.sql and V3__addCustomers.sql, etc.
Later, when we need to rename columns or add additional tables, we simply add new scripts as part of the WAR file, and when the application is loaded, Flyway looks at its internal table to see the current version and then applies any new scripts to bring it up to the desired version.
In Spring, the code would look like this (pre-Flyway-6 setter API; it assumes an SLF4J-style logger named log in the surrounding bean):

import javax.sql.DataSource;
import org.flywaydb.core.Flyway;
import org.flywaydb.core.api.MigrationInfo;
import org.flywaydb.core.api.MigrationState;

private void performFlywayMigration(DataSource dataSource) {
    Flyway flyway = new Flyway();
    flyway.setLocations("db/migration");
    flyway.setDataSource(dataSource);
    log.debug("Starting database migration.");
    flyway.migrate();
    log.debug("Database migration completed.");
    // warn if the schema is newer than the scripts shipped with this build
    MigrationInfo current = flyway.info().current();
    if (current.getState() == MigrationState.FUTURE_SUCCESS) {
        log.warn("The database schema is version " + current.getVersion()
                + ", this application expects version "
                + flyway.getBaselineVersion().getVersion());
    }
}
In general you should not use JPA to create tables directly, because you sometimes need to modify the scripts. For instance, on Sybase VARCHAR(255) means 255 bytes, so if you are storing 2- or 3-byte Unicode chars you need more space; JPA implementations do not account for that (last time I checked).

How to handle concurrent sql updates, given database structure can change at runtime

I am developing a Spring MVC application.
For now I am using InnoDB MySQL, but I have to develop the application to support other databases as well.
Can anyone please suggest how to handle concurrent SQL updates on a single record?
Suppose two users are trying to update the same record; how do I handle such a scenario?
Note: My database structure depends on some configuration (it can change at runtime) and my Spring controller is a singleton.
Thanks.
Update:
Just for reference, I am going to implement a version column as described in https://stackoverflow.com/a/3618445/3898076.
Transactions are the way to go when it comes to concurrent SQL updates; in Spring you can use a transaction manager.
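A minimal sketch of that with Spring's declarative transaction management (the service and method names are illustrative):

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class RecordService {
    // Spring wraps this method in a transaction; on an exception
    // the update is rolled back instead of leaving partial changes
    @Transactional
    public void updateRecord(long id, String newValue) {
        // load, modify and save the record here
    }
}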
As for the database structure: as far as I know, MySQL does not support transactions for DDL commands, so if you change the structure concurrently with updates, you're likely to run into problems.
To handle multiple users working on the same data, you need to implement a manual "lock" or "version" field on the table to keep track of last updates.
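JPA ships this pattern as optimistic locking: a sketch with an invented entity, where the provider increments the version column on every update and rejects stale writes with an OptimisticLockException:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Record {
    @Id
    private Long id;

    private String payload;

    @Version // maintained by the JPA provider, never set it manually
    private int version;
}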

JUnit tests using a mock database

I am developing an application that tests different web services, and I want it to be as generic as possible. I need to populate the database for JUnit tests, but I don't want these changes to be committed.
I know that some in-memory databases like HSQLDB allow testing on a sort of virtual (or mock) database, but unfortunately I use Oracle, and I cannot change that now because of my complex data table structure.
What is the best practice you suggest?
Thanks.
First of all, HSQL and Hibernate aren't related in any way. The question is whether you can find an embedded database which supports the same SQL as your production database (or rather the subset of SQL which your application uses).
A good candidate for this is the H2 database, since it emulates a lot of different SQL flavours.
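For instance, H2 can be started in a compatibility mode that mimics the production database; a small sketch with an in-memory URL (H2 also offers MODE=MySQL, MODE=MSSQLServer and others):

import java.sql.Connection;
import java.sql.DriverManager;

public class H2OracleModeExample {
    public static void main(String[] args) throws Exception {
        // in-memory database that emulates Oracle's SQL dialect
        try (Connection con = DriverManager.getConnection(
                "jdbc:h2:mem:testdb;MODE=Oracle", "sa", "")) {
            System.out.println("Connected: " + !con.isClosed());
        }
    }
}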
On top of that: Don't test the database. Assume that the database is tested thoroughly by your vendor and just works.
In my code, I aim for:
Save and load each entity.
Generate the SQL for all the queries that I use and compare them against String literals in tests (i.e. I don't run the queries against the database all the time).
Some tests look for a system property. If it's set, then they will run the queries against the database. This happens overnight on my CI server (see the sketch after this list).
The rationale for this: as long as the DB schema doesn't change, there is no point in actually running the queries. That means running them during the day, while I sit in front of the computer, is a huge waste of time.
To make sure that "low impact" changes don't slip through the gaps, I let a computer run them when I don't care.
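A hedged sketch of that guard with JUnit 4 (the property name db.integration is invented for illustration):

import org.junit.Assume;
import org.junit.Test;

public class QueryIntegrationTest {
    @Test
    public void runQueriesAgainstRealDatabase() {
        // skipped unless the build passes -Ddb.integration=true
        Assume.assumeTrue(Boolean.getBoolean("db.integration"));
        // ... open a connection and run the generated queries here
    }
}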
Along the same lines, I have mocks for many DAOs which return various predefined results, so I don't have to query the database. The rationale here is that I want to test the processing of results from the database, not the JDBC API, the DB driver, the OS's TCP/IP stack, the network hardware (and software), or any of the other 1000 things between my code and the database records on a hard disk somewhere.
More details in my blog: http://blog.pdark.de/2008/07/26/testing-with-databases/
