Liquibase generate change log on db based on table name prefix - java

Can I generate a Liquibase changelog from a DB based on a table name prefix?
Example:
If I have a DB schema and it has the following tables:
abc
abcd
abcdef
xyz
I just want to generate a changelog for the tables starting with "abc", i.e. a changelog for the tables
abc,
abcd,
abcdef
Can someone tell me if there's a way to do this?

It's possible with Maven or the Liquibase command line if you're using Liquibase version 3.3.2 or later.
Take a look at the release notes
Liquibase 3.3.2 is officially released. It is primarily a bugfix
release, but has one major new feature:
diffChangeLog/generateChangeLog object filtering.
includeObjects/excludeObjects logic
You can now set an includeObjects or excludeObjects parameter on the
command line or Ant. For Maven, the parameters are diffExcludeObjects
and diffIncludeObjects. The format for these parameters is:
An object name (actually a regexp) will match any object whose name matches the regexp.
A type:name syntax matches objects of the given type whose name matches the regexp.
If you want multiple expressions, comma-separate them.
The type:name logic will be applied to the tables containing columns, indexes, etc.
NOTE: name comparison is case-sensitive. If you want case-insensitive
logic, use the (?i) regexp flag.
Example Filters:
“table_name” will match a table called “table_name” but not “other_table” or “TABLE_NAME”
“(?i)table_name” will match a table called “table_name” and “TABLE_NAME”
“table_name” will match all columns in the table table_name
“table:table_name” will match a table called table_name but not a column named table_name
“table:table_name, column:*._lock” will match a table called table_name and all columns that end with “_lock”
So try using the excludeObjects or includeObjects parameters with the generateChangeLog command.
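Note that the filters are just Java regular expressions, so you can sanity-check a pattern with plain JDK code before handing it to Liquibase. A minimal sketch (it only demonstrates the regex matching itself, not Liquibase's internal filtering; the table names are the ones from the question):
import java.util.regex.Pattern;

public class FilterCheck {
    public static void main(String[] args) {
        // The same regexp you would pass as --includeObjects="table:abc.*"
        Pattern include = Pattern.compile("abc.*");
        for (String table : new String[] { "abc", "abcd", "abcdef", "xyz" }) {
            // matches() requires the whole name to match the pattern
            System.out.println(table + " -> " + include.matcher(table).matches());
        }
        // prints true for abc, abcd, abcdef and false for xyz

        // The case-insensitive flag mentioned in the release notes
        Pattern insensitive = Pattern.compile("(?i)table_name");
        System.out.println(insensitive.matcher("TABLE_NAME").matches()); // true
    }
}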
UPDATE
I've used the Liquibase command line, and this command does the trick (for a MySQL database):
liquibase
--changeLogFile=change.xml
--username=username
--password=password
--driver=com.mysql.cj.jdbc.Driver
--url=jdbc:mysql://localhost:3306/mydatabase
--classpath=mysql-connector-java-8.0.18.jar
--includeObjects="table:abc.*"
generateChangeLog

This works for me on Windows 10:
liquibase.properties:
changeLogFile=dbchangelog.xml
classpath=C:/Program\ Files/liquibase/lib/mysql-connector-java-8.0.20.jar
driver=com.mysql.cj.jdbc.Driver
url=jdbc:mysql://localhost:3306/liquibase?serverTimezone=UTC
username=root
password=password
schemas=liquibase
includeSchema=true
includeTablespace=true
includeObjects=table:persons
C:\Users\username\Desktop>liquibase generateChangeLog
Liquibase Community 4.0.0 by Datical
Starting Liquibase at 11:34:35 (version 4.0.0 #19 built at 2020-07-13 19:45+0000)
Liquibase command 'generateChangeLog' was executed successfully.
The mysql-connector JAR can be downloaded from MySQL's site, and the Liquibase documentation covers generateChangeLog and includeObjects in more detail.

Related

creating hive table using gcloud dataproc not working for unicode delimiter

I need to create a Hive table on a Unicode-delimited file (the delimiter is the Unicode character "\uFFFD", the replacement character).
To do this we are submitting Hive jobs to the cluster.
Tried with LazySimpleSerDe using ROW FORMAT DELIMITED:
gcloud dataproc jobs submit hive --cluster --region
--execute "CREATE EXTERNAL TABLE hiveuni_test_01(codes
string,telephone_num string,finding_name string,given_name
string,alt_finding_name string,house_num string,street_name
string,locality string,state string,reserved string,zip_code
string,directive_text string,special_listing_text string,id
string,latitude string,longitude string,rboc_sent_date string) ROW
FORMAT DELIMITED FIELDS TERMINATED BY '\uFFFD' LINES TERMINATED BY
'\n' STORED AS TEXTFILE LOCATION
'gs://hive-idaas-dev-warehouse/datasets/unicode_file';"
But this does not create the table correctly; the entire row is put into the first column only.
We are using a Cloud SQL MySQL server as the Hive metastore and have checked that MySQL uses utf8 encoding as well.
Tried with MultiDelimitSerDe:
gcloud dataproc jobs submit hive --cluster
dev-sm-35cb3516-ed82-4ec2-bf0d-89bd7e0e60f0 --region us-central1
--jars gs://hive-idaas-dev-warehouse/hive-jar/hive-contrib-0.14.0.jar --execute "CREATE EXTERNAL TABLE hiveuni_test_05 (codes string,telephone_num string,finding_name string,given_name
string,alt_finding_name string,house_num string,street_name
string,locality string,state string,reserved string,zip_code
string,directive_text string,special_listing_text string,id
string,latitude string,longitude string,rboc_sent_date string) ROW
FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' WITH
SERDEPROPERTIES ('field.delim'='\uFFFD') STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION
'gs://hive-idaas-dev-warehouse/datasets/unicode_file';"
This gives an exception - java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.serde2.MultiDelimitSerDe not found
I have added an initialization script that runs at cluster startup and places hive-contrib-0.14.0.jar (containing the class org.apache.hadoop.hive.serde2.MultiDelimitSerDe) in /usr/lib/hadoop/lib/. I can see that the jar is in that folder by SSHing to the cluster.
Is there a way for the Hive client to read Unicode delimiters while creating the table, and why do I still get a ClassNotFoundException even after placing the jar in the Hadoop lib directory?
hive-contrib-0.14.0 does not contain org.apache.hadoop.hive.serde2.MultiDelimitSerDe. Instead, the fully qualified class name is org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe. Notice the extra contrib there.
So change your query to use the correct fully qualified class name and see if that solves the issue. You probably don't have to add the hive-contrib jar explicitly; it should already be under /usr/lib/hive/lib.
HIVE-20020 and HIVE-20619 were done in Hive 4.0; since Dataproc does not have Hive 4.0 yet, they shouldn't apply to your setup.

Flyway partial migration of legacy application

We have an application with a custom database migrator which we want to replace with Flyway.
These migrations are split into some categories like "account" for user management and "catalog" for the product catalog.
Files are named $category.migration.$version.sql. Here, $category is one of the above categories and $version is an integer version starting from 0.
e.g. account.migration.23.sql
Although one could argue that each category should be a separate database, in fact they aren't, and a major refactoring would be required to change that.
Also I could use one schema per category, but again this would require rewriting all SQL queries.
So I did the following:
Move $category.migration.$version.sql to /sql/$category/V$version__$category.sql (e.g. account.migration.1.sql becomes /sql/account/V1__account.sql)
Use a metadata table per category
Set the baseline version to zero
In code that would be
String[] _categories = new String[] { "catalog", "account" };
for (String _category : _categories) {
    Flyway _flyway = new Flyway();
    _flyway.setDataSource(databaseUrl.getUrl(), databaseUrl.getUser(), databaseUrl.getPassword());
    _flyway.setBaselineVersion(MigrationVersion.fromVersion("0"));
    _flyway.setLocations("classpath:/sql/" + _category);
    _flyway.setTarget(MigrationVersion.fromVersion(_version + "")); // _version: latest version for this category
    _flyway.setTable(_category + "_schema_version");
    _flyway.setBaselineOnMigrate(true); // (1)
    _flyway.migrate();
}
So there would be the metadata tables catalog_schema_version and account_schema_version.
Now the issue is as follows:
Starting with an empty database I would like to apply all pre-existing migrations per category, as done above.
If I remove _flyway.setBaselineOnMigrate(true); (1), then the catalog migration (the first one) succeeds, but it would complain for account that the schema public is not empty.
Likewise setting _flyway.setBaselineOnMigrate(true); causes the following behavior:
The migration of "catalog" succeeds but V0_account.sql is ignored and Flyway starts with V1_account.sql, maybe because it somehow still thinks the database was already baselined?
Does anyone have a a suggestion for resolving the problem?
Your easiest solution is to keep each category's schema_version table in a schema of its own. I've answered a very similar question along these lines before.
Regarding your observations on baselining, those are expected traits. The migration of account starts at V1 because, with the combination of baseline=0, baselineOnMigrate=true and a non-empty target schema (catalog has already populated it), Flyway determines that this is a pre-existing database equal to the baseline, and thus starts at V1.
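A sketch of what that could look like with the pre-6.0 setter API from the question: Flyway creates its metadata table in the first schema of the schemas list, so each category can get its own schema for that table. The flyway_<category> names are made up for illustration and databaseUrl is the object from the question; treat this as a starting point, not a drop-in fix.
String[] _categories = new String[] { "catalog", "account" };
for (String _category : _categories) {
    Flyway _flyway = new Flyway();
    _flyway.setDataSource(databaseUrl.getUrl(), databaseUrl.getUser(), databaseUrl.getPassword());
    // The first schema in this list is where Flyway creates its metadata table,
    // so each category gets a dedicated one (e.g. "flyway_account"). Note that it
    // also becomes the default schema for the migration connection, so SQL that
    // relies on "public" being the default may need to qualify its objects or set
    // the search path itself.
    _flyway.setSchemas("flyway_" + _category, "public");
    _flyway.setTable(_category + "_schema_version");
    _flyway.setLocations("classpath:/sql/" + _category);
    _flyway.setBaselineVersion(MigrationVersion.fromVersion("0"));
    _flyway.migrate();
}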

Create database upgrade diff with Liquibase

I have two database schema files, both of them are empty. Let's say that there's a version 1 and a version 2 of the database schema in db.v1.sql and db.v2.sql.
I'd like to create a diff which will update a database that has the schema from db.v1.sql to db.v2.sql.
Is Liquibase capable of doing that?
Is there another tool to do it from Java?
Yes, this is do-able using Liquibase.
Create a changelog.xml file that lists the .sql files as separate changesets. Think of this file as 'tempChangeLog.xml'. In this file, add a label attribute to each of the changesets with "v1" or "v2".
Use liquibase update with 'tempChangeLog.xml' to apply the first label to a first database instance.
Use liquibase generateChangelog to "convert" the sql to liquibase xml changesets. This will be your 'realChangeLog.xml'
Modify the 'realChangeLog.xml to add "v1" label attribute to all the changesets.
Use liquibase update with 'tempChangeLog.xml' to apply the second label to a second database instance.
Use the liquibase diffChangelog command to compare database instance 1 with database instance 2, appending the changes to 'realChangeLog.xml'
Modify 'realChangeLog.xml' again to add "v2" labels to all the new changesets.
You will now have a changelog.xml that can be used to update a database to either v1 or v2.
Synchronizing new changes with your ORM is a separate exercise.
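Since the question also asks about doing it from Java: once realChangeLog.xml carries the "v1"/"v2" labels, applying everything up to a given version from code looks roughly like this (a sketch against the Liquibase 3.x Java API; the JDBC URL, credentials and class name are placeholders):
import java.sql.Connection;
import java.sql.DriverManager;

import liquibase.Contexts;
import liquibase.LabelExpression;
import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.ClassLoaderResourceAccessor;

public class ApplyVersionOne {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; replace with your own.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydatabase", "username", "password")) {
            Database database = DatabaseFactory.getInstance()
                    .findCorrectDatabaseImplementation(new JdbcConnection(connection));
            Liquibase liquibase = new Liquibase(
                    "realChangeLog.xml", new ClassLoaderResourceAccessor(), database);
            // Run only the changesets labeled "v1"; pass "v2" instead to bring
            // the database up to the second schema version.
            liquibase.update(new Contexts(), new LabelExpression("v1"));
        }
    }
}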

How to ignore placeholder expressions for Flyway?

I am using Flyway version 2.3. I have an SQL patch which inserts into a table a varchar containing a character sequence that Flyway treats as a placeholder. I want Flyway to ignore placeholders and run the script as is.
The script file is
insert into test_data (value) values ("${Email}");
And the Java code is
package foobar;

import com.googlecode.flyway.core.Flyway;

public class App
{
    public static void main( String[] args )
    {
        // Create the Flyway instance
        Flyway flyway = new Flyway();

        // Point it to the database
        flyway.setDataSource("jdbc:mysql://localhost:3306/flywaytest", "alpha", "beta");

        // Start the migration
        flyway.migrate();
    }
}
This can be done by splitting $ and { in the expression:
insert into test_data (value) values ('$' || '{Email}')
You can change the placeholder prefix or suffix to a different value and you should be OK.
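With the Flyway 2.x API from the question that approach looks roughly like the sketch below; the prefix "$flyway{" is an arbitrary value chosen only because it never appears in the SQL, and you should check that your exact version exposes these setters:
import com.googlecode.flyway.core.Flyway;

public class App {
    public static void main(String[] args) {
        Flyway flyway = new Flyway();
        flyway.setDataSource("jdbc:mysql://localhost:3306/flywaytest", "alpha", "beta");

        // Pick a prefix that never occurs in your scripts so that "${Email}"
        // is no longer recognized as a placeholder. "$flyway{" is arbitrary.
        flyway.setPlaceholderPrefix("$flyway{");
        flyway.setPlaceholderSuffix("}");

        flyway.migrate();
    }
}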
Try these properties:
final var flyway = Flyway.configure()
        .dataSource(DataSourceProvider.getInstanceDataSource())
        .locations("path")
        .outOfOrder(true)
        .validateOnMigrate(false)
        .placeholderReplacement(false)
        .load();
In my MySQL migration script this worked:
I just escaped the opening { characters, like this:
'...<p>\nProgram name: $\{programName}<br />\nStart of studies: $\{startOfStudies}<br />\n($\{semesterNote})\n</p>...'
This way Flyway didn't recognize them as placeholders, and the string that ends up stored doesn't contain the escape character.
...<p>
Program name: ${programName}<br />
Start of studies: ${startOfStudies}<br />
(${semesterNote})
</p>...
I had exactly the same problem, but the accepted answer didn't fit my requirements. So I solved the problem in another way and post this answer hoping that it'll be useful to other people coming here from Google search.
If you cannot change the placeholder suffix and prefix, you can trick Flyway into believing there are no placeholders by using an expression. E.g.:
INSERT INTO test_data(value) VALUES (REPLACE("#{Email}", "#{", "${"));
This is useful if you've already used placeholders in lots of previous migrations. (If you just change placeholder suffix and prefix, you'll have to change them in previous migration scripts, too. But then the migration script checksums won't match, Flyway will rightfully complain, and you'll have to change checksums in the schema_version table by calling Flyway#repair() or manually altering the table.)
Just add a property to your bootstrap.properties (or whatever you use):
flyway.placeholder-replacement = false
In 2021, the simple answer is to set the placeholderReplacement boolean to false:
flyway -placeholderReplacement="false"
The configuration parameter placeholderReplacement determines whether placeholders should be replaced.
Reference: https://flywaydb.org/documentation/configuration/parameters/placeholderReplacement

Error in importing a tsv to hbase

I created a table in hbase using:
create 'Province','ProvinceINFO'
Now I want to import my data from a TSV file into it. My TSV file has two columns: ProvinceID (as the primary key) and ProvinceName.
I am using the command below for the import:
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv '-Dimporttsv.separator=,'
-Dimporttsv.columns= HBASE_ROW_KEY, ProvinceINFO:ProvinceName Province /usr/data/Province.csv
but it gives me this error:
ERROR: No columns specified. Please specify with -Dimporttsv.columns=...
Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
Imports the given input directory of TSV data into the specified table.
The column names of the TSV data must be specified using the -Dimporttsv.columns
option. This option takes the form of comma-separated column names, where each
column name is either a simple column family, or a columnfamily:qualifier. The special
column name HBASE_ROW_KEY is used to designate that this column should be used
as the row key for each imported record. You must specify exactly one column
to be the row key, and you must specify a column name for every column that exists in
the input data. Another special column HBASE_TS_KEY designates that this column should be
used as timestamp for each record. Unlike HBASE_ROW_KEY, HBASE_TS_KEY is optional.
You must specify at most one column as timestamp key for each imported record.
Record with invalid timestamps (blank, non-numeric) will be treated as bad record.
Note: if you use this option, then 'importtsv.timestamp' option will be ignored.
By default importtsv will load data directly into HBase. To instead generate
HFiles of data to prepare for a bulk data load, pass the option:
-Dimporttsv.bulk.output=/path/for/output
Note: if you do not use this option, then the target table must already exist in HBase
Other options that may be specified with -D include:
-Dimporttsv.skip.bad.lines=false - fail if encountering an invalid line
'-Dimporttsv.separator=|' - eg separate on pipes instead of tabs
-Dimporttsv.timestamp=currentTimeAsLong - use the specified timestamp for the import
-Dimporttsv.mapper.class=my.Mapper - A user-defined Mapper to use instead of
org.apache.hadoop.hbase.mapreduce.TsvImporterMapper
-Dmapred.job.name=jobName - use the specified mapreduce job name for the import
For performance consider the following options:
-Dmapred.map.tasks.speculative.execution=false
-Dmapred.reduce.tasks.speculative.execution=false
Maybe also try wrapping the columns into a string, i.e.
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=','
-Dimporttsv.columns="HBASE_ROW_KEY, ProvinceINFO:ProvinceName" Province /usr/data/Province.csv
You should try something like:
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=','
-Dimporttsv.columns=HBASE_ROW_KEY,ProvinceINFO:ProvinceName Province /usr/data/Province.csv
Try to remove the spaces in -Dimporttsv.columns=a,b,c.
