How to load initial data (or seed data) using Java JPA?


I have a JPA project and I would like to insert some initial data during development only, so I can easily check that everything is running smoothly.
My research led me only to solutions that use a direct SQL script, but that doesn't seem right. If I'm using a framework to abstract away database details, why would I write a script for a specific database?
In the Ruby on Rails world we have the command "rake db:seed", which simply executes a file named seeds.rb whose job is to add the initial data to the database by calling the abstraction layer. Is there something like that in Java?
The ideal solution I can think of would be a Maven goal that executes a Java class. Is there an easy way, or a Maven plugin, to do that?

I feel your pain; in Java projects I have often found myself wanting the perks Rails has.
That being said, there is no reason to use straight SQL. That approach is just asking for trouble: as your database schema changes during development, all the brittle SQL breaks. It is easier to manage data if it is mapped to JPA models, which abstract the SQL interaction with the database.
What you should do is use your JPA models to seed your data. Create a component that creates the models you require and persists them. In my current project, we use SnakeYAML to serialize our models as YAML. To seed our database we deserialize the YAML into JPA models and persist them.
If the models change (variable types change, columns are removed, etc.), you have to make sure that the serialized data can still be deserialized correctly into the JPA models. The human-readable YAML format makes it easy to update the serialized models.
To actually run your seed data, bootstrap your system however you can. As @GeoorgeMcDowd said, you can use a servlet. I personally prefer to create a command line tool by building an uberjar with a Class.main. Then you just need a script that sets up your classpath and calls Class.main to run the seed.
Personally, I love Maven for project metadata but find it difficult as a build tool. The following can be used to execute a Java class:
mvn exec:java -Dexec.mainClass="com.package.Main"
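As a rough illustration of the "seed through your models, run from a main class" idea above, here is a minimal sketch. The class names (SeedRunner, User) and the sample values are hypothetical, and the persistence call is passed in as a plain Consumer so the sketch runs without any JPA provider; in a real project you would presumably pass something like entityManager::persist inside a transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical seed runner; a sketch, not the code used in the answer above.
class SeedRunner {

    // Stand-in for a JPA model; a real one would carry @Entity mapping annotations.
    static final class User {
        final String name;
        final String email;

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Builds the development seed objects in one place.
    static List<User> seedUsers() {
        List<User> users = new ArrayList<>();
        users.add(new User("admin", "admin@example.com"));
        users.add(new User("demo", "demo@example.com"));
        return users;
    }

    // Hands every seed object to the persistence layer and returns how many were handed over.
    static int seed(Consumer<Object> persist) {
        int count = 0;
        for (User user : seedUsers()) {
            persist.accept(user);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumption: with JPA this would be entityManager::persist, wrapped in a transaction.
        List<Object> saved = new ArrayList<>();
        int n = seed(saved::add);
        System.out.println("Seeded " + n + " objects");
    }
}
```

Because the seeder only depends on a "persist this object" callback, the same class can be launched through exec:java, from a servlet init, or from a test.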

Just create a class and a method that creates the objects and persists the data. When you fire up your application, run that method from a servlet's init. You can load your servlet on startup with the following web.xml config.
<servlet>
    <servlet-name>MyServlet1</servlet-name>
    <servlet-class>com.example.MyServlet1</servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>

You could set up your project with Maven and write a simple test that initializes the seed data, so the only thing you need to do is run "mvn test".

Similar to amuniz's idea: have a look at DbUnit. It is a JUnit extension for populating test data into a database. It uses a simple schema-less XML format for that, and running it through a test class using "mvn test" is simple to do.

I would suggest Liquibase (http://www.liquibase.org/) for this. It has many plugins and allows you to define the rollback logic for every change set (in some cases it can work out the rollback by itself).
In this case it is also important to think about the production servers and how the seed data will be moved to production.

Related

Configuring database development environment along with Hibernate and Spring

We have a web-based application in the development phase where we use Spring 5, JPA (Hibernate) and PostgreSQL 9.4.
Until now we have been using a single instance of the PostgreSQL database for our work. Basically, we don't have any schema generation script; we simply updated the database whenever we needed a new table, column, etc. For Hibernate, we generated the classes from the database.
Now that we have some amount of test data, each change in the database brings a lot of trouble and confusion. We realized that we need to create and start maintaining a schema generation file along with some scripts that generate test data.
After some research, we see two options:
Create two *.sql files. The first will contain the schema generation script, the second the SQL to create test data. Then add a small module with a class that executes the *.sql files using plain JDBC. Basically, we will continue developing, and whenever we make some changes we quickly wipe->create->populate the database. This approach looks the most appealing to us at this point. It is quick, simple and robust.
The second is to set up a tool that may help with that, e.g. Liquibase.
This approach also looks good in terms of versioning support and other capabilities. However, we are not in production yet; we are in an active development phase. We don't have many developers making database changes, and we are not sure how frequently we will update the database schema in production; it could be rare.
The question is the following: would the first approach be bad practice, and would the second one give enough benefits to be worth using?
We would appreciate any comments or other suggestions!

The first approach is NOT a bad practice as of today, but it is becoming one given the growth of tools like Liquibase.
If you are in the early or middle stages of the development phase, go ahead with Liquibase, along with Spring Data. Conversely, if you are in the closing stages of the development phase, consider whether you really need it.

I would suggest the second approach, as it will automatically find new scripts as you add them and execute them on startup. Moreover, when tools like Liquibase and Flyway are available, why reinvent the wheel?
The second approach will also remove the unnecessary code for manually executing the *.sql files. That code also needs testing and can be error-prone whenever it is updated.
Moreover, with the first approach, your hand-written code also has to check which scripts need to be executed. If you already have an existing database and you are adding some new scripts, you need to execute only those new scripts. These things are taken care of automatically with the second approach, and you don't need to worry about an already executed script being executed again.
Hope this answers your concern. Happy coding!
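To make the bookkeeping mentioned above concrete, here is a minimal sketch of the "run only the scripts that have not been run yet" logic that a hand-written runner would need. The class name ScriptTracker and the file names are hypothetical, the applied set lives in memory purely for illustration (a migration tool keeps this history in a database table instead), and actually executing the SQL is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical bookkeeping for a hand-written script runner; a sketch only.
class ScriptTracker {

    // Names of scripts that have already been executed.
    private final Set<String> applied = new LinkedHashSet<>();

    // Returns the scripts that have not been applied yet, sorted by name.
    // Assumption: names sort in execution order, e.g. zero-padded prefixes like 001_, 002_.
    List<String> pending(Set<String> available) {
        List<String> result = new ArrayList<>();
        for (String name : new TreeSet<>(available)) {
            if (!applied.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    // Marks every pending script as applied and returns the ones handled in this call.
    List<String> run(Set<String> available) {
        List<String> toRun = pending(available);
        for (String name : toRun) {
            // A real runner would execute the script's SQL here before recording it.
            applied.add(name);
        }
        return toRun;
    }

    public static void main(String[] args) {
        ScriptTracker tracker = new ScriptTracker();
        Set<String> scripts = new TreeSet<>();
        scripts.add("001_schema.sql");
        scripts.add("002_seed.sql");
        System.out.println(tracker.run(scripts)); // both scripts are pending on the first run
        scripts.add("003_add_column.sql");
        System.out.println(tracker.run(scripts)); // only the newly added script is pending
    }
}
```

Even this small version shows why the code needs its own tests: ordering, persistence of the history and failure handling all have to be written and maintained by hand.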

Simple embedded database with spring

How do I set up a simple embedded database in a Spring (Data) + Maven project?
I need to develop a simple graphical application that reads some data files and displays pretty stuff about them interactively. The data is very repetitive, with a little hierarchical structure. However, I still don't know how I will need to access it.
For these reasons, I want to store it in a database so that I can later use DB queries with filters to access the data. (It also seems a good idea to develop a persistence layer.)
Because it is a small application, I want to use an in-memory DB.
I am quite new to Java (using a proper dev framework) and to databases. But I worked on a project using Spring, spring-data, JPA, etc. I did not really understand how it worked internally and would not be able to set it up myself, but I found it very practical.
Now, I found lots of docs and tutorials on the internet about that, but I didn't understand enough to know how to adapt them to my needs. What (I think) I want is:
to use maven+spring
spring data (I guess) to use Entity, JpaRepository and Autowired stuff
an independent program, thus starting from an Application.main method
as little and simple dependencies as possible
an embedded DB (+fast+light if possible)
genericity is nice
What I feel lost about is:
where should I put what properties/xml-declaration
how are all the dependencies working together (spring, spring-data, h2, hsqldb, ...)
I found this project https://github.com/wrpinheiro/spring-jpa-embedded-db that looks like a fit, but:
there are way too many dependencies that (I think) I don't need, thus don't want
I don't know how to start a program with it
I don't get the org.springframework.stereotype.Service thing
nor the javax.inject.Inject

I think that if you look at this project you can start building what you need:
http://spring.io/guides/gs/accessing-data-rest/#initial
It uses Maven (or Gradle), has an embedded DB and spring-jpa, and runs as a jar that starts its own Tomcat server (you can change it to a war build if you want).
You can also use this service(?) that Spring provides to create the starting build for your project:
http://start.spring.io
You tell it what you want to build, and the code and required files are generated :D
Pretty neat.

Code-first like approach in Dropwizard Migrations Liquibase

Currently I'm working on a small web service using Dropwizard, connecting to a PostgreSQL DB using Hibernate (the package built into Dropwizard) and with a bit of Migrations (also from Dropwizard).
Coming from a .NET environment, I'm used to a code-first/code-centric approach.
Currently I'm looking into generating the migrations.xml from the current state of my entity classes, based on the JPA annotations on them.
I feel this is a case somebody might have already solved.
Is there a way to automatically update the migrations.xml based on the classes I'm writing?

It is possible. See the liquibase-hibernate plugin at https://github.com/liquibase/liquibase-hibernate/wiki.
Make sure you look at the generated migrations.xml changes before applying them because, like any diff-based process, the schema transformation may not be what you intended and that matters with data. For example, if you rename a class it will generate a drop + create process rather than a rename operation. The result is a valid schema, but you lose data.

OpenJPA: Code to build entities automatically from DB

Hi, I'm looking for code or a tool to generate entities automatically. I'm not looking for software like EclipseLink, which has to be executed manually, but rather a piece of code (or a Maven plugin) that can be run automatically whenever the DB changes. (If I could auto-run EclipseLink via a cron job, that would work for me.)
Some other options:
I think Hibernate offers a reverse engineering method that can be called from a Maven build and auto-generates the entities from DB schemas. Does anyone have such a tool for OpenJPA?
Any command line utility where you just specify the DB URLs and options and the utility generates the entities. I can just write a cron job to run the utility nightly, etc.
Any software that can be called automatically via cron and generates the entities will also do.
Update:
The OpenJPA reverse mapping tool seems to really suck at generating a proper entity with annotations, mappings and so on... I would be glad if someone corrected me.

Check out Reverse Mapping in the user manual. You can launch that from an ant task.

I doubt a fully automated tool like that can exist, simply because it can't be done well without human intervention. How would, for example, the algorithm decide which attributes should be taken into account in equals() and hashCode()? Or whether new relations should be uni- or bidirectional? Lazy or eager loading? And so on.
As you know, and as others have noted, the tools themselves exist, but they are intended to be run once, after which you tweak the result and work with it from then on, rather than to be part of a continuous integration process.
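To illustrate the equals()/hashCode() point, here is a small sketch of the kind of decision a generator cannot make for you. The Customer class and the choice of email as the business key are hypothetical; a reverse-engineering tool only sees three columns and has no way to know which of them identifies a customer in your domain. JPA annotations are omitted so the sketch stays self-contained.

```java
import java.util.Objects;

// Hypothetical entity; the mapping annotations a generator would add are left out.
class Customer {
    private Long id;             // database-generated, null until the entity is persisted
    private final String email;  // assumed business key, a human decision
    private String displayName;  // mutable and deliberately ignored by equals/hashCode

    Customer(String email, String displayName) {
        this.email = email;
        this.displayName = displayName;
    }

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Customer)) {
            return false;
        }
        // Equality is based on the business key only, not on id or displayName.
        return Objects.equals(email, ((Customer) o).email);
    }

    @Override
    public int hashCode() {
        return Objects.hash(email);
    }
}
```

Basing equality on the generated id instead would be just as valid from the schema's point of view, which is exactly why the choice needs a person.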

Database migration pattern for Java?

I'm working on some database migration code in Java. I'm also using a factory pattern so I can use different kinds of databases, and each kind of database I'm using implements a common interface.
What I would like to do is have a migration check that is internal to the class and runs some database schema update code automatically. The actual update is pretty straightforward (I check the schema version in a table and compare it against a constant in my app to decide whether to migrate, and between which schema versions).
To make this automatic, I was thinking the check should live inside (or be called from) the constructor. OK, fair enough, that's simple enough. My problem is that I don't want the check to run every single time I instantiate a database object (it runs a query, so having it run on every construction is not efficient). So maybe this should be a static class method? I guess my question is: what is a good design pattern for this type of problem? There ought to be a clean way to ensure the migration check runs only once OR is super-efficient.

Have a look at Liquibase.
Here's an IBM developerWorks article that has a nice walk-through: http://www.ibm.com/developerworks/java/library/j-ap08058/index.html

Flyway fits your needs perfectly. It supports multiple databases, compares the schema version with the available migrations on the classpath and upgrades the database accordingly.
You can embed it in your application and have it run once on startup as described in the Flyway docs.
Note: Flyway also comes with a Maven plugin and the ability to clean an existing schema in case you messed things up in development.
[Disclaimer: I'm one of Flyway's developers]

I've been using the iBatis SQL Mapper and really like it. The next version, iBatis 3.0, has schema migrations support. This is still in beta, but I'm planning on using it when it gets closer to a release candidate.
