I am thinking about possible ways of migrating users between two systems. It is a financial system, a web application (Java, Spring, Hibernate, Oracle, JBoss, etc.). There are 500k users to migrate together with their data, such as accounts, contractors, transfers and many others. The new application is already running and has 10k users.
Currently I am only considering online versus offline migration. Online means that the application stays accessible to its users during the migration; offline means that I turn it off and display a technical-break message while the migration runs. The client doesn't want to turn the application off, so that would mean an online migration. While the application is on (accessible to its users), many users may perform different operations (also through external systems), and many background processes are running and changing database data. It would be quite risky to migrate online:
- no usable database backup: users would keep changing data in the application during the migration, so there would be no restore point to return to,
- problems during migration could block users that are online (database, transaction locks etc).
And maybe you have some strong arguments to convince my customer that online migration is nonsense? They could probably be grouped by application layer: JBoss server risks, database risks, business risks...
You're going to have to put your storage (database) in read-only mode during the migration. The application won't be down, but features that alter data won't be available. While in read-only mode, you copy the data over to the new primary site. Once all the data is copied, the users must be redirected to the new site and the application becomes read-write again.
If a read-only mode isn't acceptable then you'll have to maintain two databases in sync. Products such as GoldenGate can do that for you.
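A minimal sketch of what an application-level read-only switch could look like, assuming every write path goes through a single guard. `MaintenanceMode` and `TransferService` are hypothetical names for illustration, not part of Spring, Hibernate or JBoss.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical application-level switch; not part of any framework. */
final class MaintenanceMode {
    private final AtomicBoolean readOnly = new AtomicBoolean(false);

    void enterReadOnly() { readOnly.set(true); }
    void leaveReadOnly() { readOnly.set(false); }
    boolean isReadOnly() { return readOnly.get(); }

    /** Call at the start of every operation that modifies data. */
    void requireWritable() {
        if (readOnly.get()) {
            throw new IllegalStateException("System is read-only during migration");
        }
    }
}

/** Hypothetical service: writes are guarded, reads stay available. */
final class TransferService {
    private final MaintenanceMode mode;
    private int transfersCreated = 0;

    TransferService(MaintenanceMode mode) { this.mode = mode; }

    int createTransfer() {
        mode.requireWritable();   // rejected while the migration runs
        return ++transfersCreated;
    }

    int countTransfers() {
        return transfersCreated;  // read path, never blocked
    }
}

final class ReadOnlyModeDemo {
    public static void main(String[] args) {
        MaintenanceMode mode = new MaintenanceMode();
        TransferService service = new TransferService(mode);
        service.createTransfer();
        mode.enterReadOnly();
        try {
            service.createTransfer();
        } catch (IllegalStateException e) {
            System.out.println("Write rejected: " + e.getMessage());
        }
        System.out.println("Transfers visible: " + service.countTransfers());
    }
}
```

A flag like this only covers code that checks it. Writes already in flight when the flag flips, background jobs and external systems would still need to be drained or stopped, so enforcing read-only access in the database as well is probably safer.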
Related
I have a SpringBoot application running on a multitenant architecture.
I've two databases Admin and Client (both are MySQL) and both these databases have a User table
Client can add users to the User Table but I need them to get synchronized in the User table of Admin database.
Is there a way I can achieve this?
I've read about flyway migrations but I think it works more on database schema changes and not values.
Please ignore my mistakes as this is my first question; any help would be appreciated.
This looks like a solution to your problem:
SymmetricDS is software that replicates relational database tables between multiple databases. It can also be used to replicate files and directories between multiple hosts. It uses a light-weight, web-based protocol to send and receive data, which makes it easy to work with firewalls. Replication is done in the background asynchronously, allowing data changes in offline mode. It supports most commercial and open source database platforms.
How does it work?
Triggers are installed in the database to guarantee that data changes are captured. This means that applications continue to use the database as usual without any special driver software. The triggers are written to be as small and efficient as possible. Routing and syncing of data is done outside of the database in the SymmetricDS process.
SymmetricDS supports many databases and can replicate across different databases, including Oracle, MySQL, MariaDB, PostgreSQL, MS SQL and many more.
https://www.symmetricds.org/docs/faq
You need to create some event from the flow where client adds user to the User Table.
If this "client" flow is in the same Java service, you can use Spring's asynchronous event handling, or have a method (the one that copies the data) annotated with @Async. This ensures the data copy happens in a separate thread.
If the "client" flow is in a different Java service, any publisher-subscriber model can be used (open-source options include Kafka, RabbitMQ, etc.).
To connect to two datasources at the same time, Spring's AbstractRoutingDataSource comes in handy, as it uses a "lookup key" to choose the datasource. Alternatively, you can define two fixed datasource beans in your config (since the set of databases is fixed in your case).
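To illustrate the same-service case without pulling in Spring, here is a plain-Java sketch that uses `CompletableFuture` as a stand-in for Spring's asynchronous method execution. `UserTable` and `UserSyncService` are hypothetical names, and the in-memory maps only stand in for the two MySQL `User` tables.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** In-memory stand-in for a User table; a real version would use a DataSource. */
final class UserTable {
    private final Map<Long, String> rows = new ConcurrentHashMap<>();

    void upsert(long id, String name) { rows.put(id, name); }
    String find(long id) { return rows.get(id); }
}

/** Hypothetical service that writes to Client and copies to Admin in the background. */
final class UserSyncService {
    private final UserTable clientDb;
    private final UserTable adminDb;
    private final ExecutorService copier;

    UserSyncService(UserTable clientDb, UserTable adminDb, ExecutorService copier) {
        this.clientDb = clientDb;
        this.adminDb = adminDb;
        this.copier = copier;
    }

    /** Writes to the Client table on the caller's thread, then copies on another thread. */
    CompletableFuture<Void> addUser(long id, String name) {
        clientDb.upsert(id, name);
        return CompletableFuture.runAsync(() -> adminDb.upsert(id, name), copier);
    }
}

final class UserSyncDemo {
    public static void main(String[] args) {
        ExecutorService copier = Executors.newSingleThreadExecutor();
        try {
            UserTable client = new UserTable();
            UserTable admin = new UserTable();
            UserSyncService service = new UserSyncService(client, admin, copier);

            // join() is only here so the demo can print the copied row.
            service.addUser(1L, "alice").join();
            System.out.println("Admin copy: " + admin.find(1L));
        } finally {
            copier.shutdown();
        }
    }
}
```

Because the copy runs after the first write has completed, a failure in the background task would leave the two tables out of sync, so a real implementation would likely need retries or a reconciliation job.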
We have an Oracle database which holds data about some cities, places, etc.
We have a web system through which we can manipulate this data.
We also have a desktop client application which works with this data.
To improve our desktop application's performance and reduce unnecessary requests to our DAO layer, we have implemented some Singleton classes in the desktop application that fetch the cities, places, etc. data only once, right after the user opens the desktop application.
Recently our clients asked why they don't see the changes they make in the web application while the desktop application is up and running. They're complaining that they have to close the desktop app and open it again in order to see the changes.
We know that the problem is those Singleton classes but we don't want to change them because it's gonna be huge overhead in our system when they're not there. For solving the problem we have thought about multiple solutions:
Create a table in the database with integer columns named after our data (cities, places, etc.) and increment the matching value whenever there is an update, so that changes can be tracked through it (a lightweight solution)
Using database functionalities
A notification system that notifies the client application whenever an update occurs.
A caching mechanism inside the database that caches those recently changed tables and serves our users when they make similar requests
Here are our stacks:
Our desktop application is a Swing application
Our web application is JSF
Our business layer for both JSF and Swing is EJB
Our DAO layer for both JSF and Swing is EclipseLink
What do you think is the best practice for solving this problem ?
Oracle has a feature called "Database Change Notification" that can be used to be notified when read-mostly tables are changed. It looks like this feature could be a good fit to address your requirement. The link to the doc is here.
In a nutshell, the way it works is that JDBC thin driver in your desktop application would open a port and the Oracle Database would connect to that port and use this connection to push notifications when data changes. You then get a callback through an event/listener API and can refresh your cache.
This notification mechanism is designed for data that is read-mostly, in other words, data that doesn't constantly change otherwise it wouldn't be worth caching the data anyway.
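On the client side, the Singleton caches would need a way to be refreshed when a notification arrives. A minimal sketch, assuming the change listener simply calls `invalidate()`; `RefreshableCache` is a hypothetical helper, and the wiring to the Oracle JDBC listener is deliberately left out.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Cache that loads once and reloads only after invalidate() has been called. */
final class RefreshableCache<T> {
    private final Supplier<T> loader;   // must not return null
    private volatile T value;           // null means "not loaded or invalidated"

    RefreshableCache(Supplier<T> loader) { this.loader = loader; }

    T get() {
        T current = value;
        if (current == null) {
            synchronized (this) {
                current = value;
                if (current == null) {
                    current = loader.get();   // e.g. a DAO call
                    value = current;
                }
            }
        }
        return current;
    }

    /** Intended to be called from the change-notification callback. */
    synchronized void invalidate() { value = null; }
}

final class CityCacheDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        RefreshableCache<List<String>> cities = new RefreshableCache<>(() -> {
            loads.incrementAndGet();          // stands in for a database query
            return List.of("Paris", "Oslo");
        });

        cities.get();
        cities.get();                          // served from the cache
        cities.invalidate();                   // what the listener would do
        cities.get();                          // triggers a reload
        System.out.println("Loads: " + loads.get());
    }
}
```

This keeps the "fetch once" behaviour for normal use and only goes back to the DAO layer after a change has been reported.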
I'm working on a school project where the client needs to have multiple users querying and writing to a single data source. The users have access to shared network drives and all functionality has to be in the client application, the IT department won't allow a service to run from one of their servers and external server hosting isn't an option.
The amount of data that needs to be stored is very small, about 144 rows maximum.
I've looked into using embedded databases (SQLite, HSQLDB, ObjectDB, etc.) but they seem overkill for how little data needs to be saved. It also seemed that with HSQLDB, if anyone accessed the database, it would be completely locked to every other user. Concurrency wouldn't be much of an issue: there will be 5-7 people using the system, and only rarely, a few times a year.
Would using something like XQuery and serializing everything in XML be a viable option, or just simply using the Java Serializable API?
A distributed, client-side database writing files to the shared network drive could be a good solution for this use case. Take a look at Cloud DB, it might be what you're looking for.
Does the term 'embedded database' carry different meaning from 'database'?
There are two definitions of embedded databases I've seen:
Embedded database as in a database system particularly designed for the "embedded" space (mobile devices and so on.) This means they perform reasonably in tight environments (memory/CPU wise.)
Embedded database as in databases that do not need a server, and are embedded in an application (like SQLite.) This means everything is managed by the application.
I've personally never seen the term used exactly as Wikipedia defines it, but that's probably my fault, although it resembles quite a bit my number 2 above.
The word 'embedded' does add meaning, basically that the database is dedicated to a specific application rather than shared among multiple applications, to a degree hidden from the user of the application, and completely controlled by the application.
An embedded database is conceptually just a part of the application rather than a separate thing.
Just look at the usage of, for example, an embedded H2 database. You don't need a server running on your machine; your whole database is stored in one local file (originally there were two). It is opened and locked when you connect to your DB, and it is unlocked when you disconnect.
When a developer embeds a database library inside an application and there is no need for an administrator, it is called an embedded database. The database is hidden, but data management via SQL (e.g. ITTIA DB SQL) or NoSQL (e.g. Berkeley DB) is accessible through APIs. Embedded databases are common in web development and device applications.
In my web application I have a part which needs to continuously crawl the Web, process those data and present it to a user. So I was wondering if it is a good approach to split it up into two separate applications where one would do the crawling, data processing and store the data in the database. And the other app would be a web application (mounted on some web server) which would present to a user the data from the database and allow him a certain interaction with the data.
The reason I think I need this split is because if I make certain changes to my web app (like adding new functionalities, change the interface etc.) I wouldn't like the crawling to be interrupted.
My application stack is Tapestry (web layer), Spring, Hibernate (over MySQL) and my own implementation of the crawler independent from the others.
Is it good for the integration to be done just by using the same database? This might cause an issue with accessing the database from both applications at the same time. Or can the integration be done at the Hibernate level, so that both applications use the same Hibernate session? But can an app in one JVM instance access an object from another JVM instance?
I would be grateful for any suggestions regarding this matter.
UPDATE
The user (from the web app's interface) would enter the URLs for the crawler to parse. The crawler app would just read the tables with URLs that the web app populates. And vice versa, the data processed by the crawler would just be presented in the user interface. So I think I shouldn't need to worry about any kind of locking, right?
Thanks,
Nikola
I would definitely keep them separated like you are planning. The web crawling is more a "batch" process than a request driven web application. The web crawling app will run in its own JVM and your web app will be running in a servlet/Java EE container.
How often will the crawler run or is it a continuously running process? You may want to consider the frequency based on your requirements.
Will the users from web app be updating the same tables that the crawler will post data to? In that case you will need to take precaution otherwise a potential deadlock may arise. If you want your web app to auto refresh data based on new inserts in the tables then you can create a message driven bean (using JMS) to asynchronously notify the web app from the crawler app. When a new data insert message arrives you can either do a form submit on your page or use ajax to update the data on the page itself.
The web app should use connection pooling and the batch app could use DBCP or C3P0. I am not sure you gain much benefit by trying to share the database sessions in this scenario.
This way you have the integration between the two apps without either one slowing down while waiting for the other to finish processing.
HTH!
You are right, splitting the application into two could be reasonable in your case.
Disadvantages of separating into two applications -
You cannot safely cache, in Hibernate or anywhere else, mutable objects that both applications can modify, because a cache in one application will not see changes made by the other. Optimistic locking should work fine with two Hibernate applications. I don't see any other problems.
The advantages you have already listed in your question.
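For readers unfamiliar with optimistic locking, the idea can be sketched in plain Java: each row carries a version number, and an update only succeeds if the version is still the one the writer originally read. Hibernate offers this through a version attribute on the entity; the `UrlStore` class below is a hypothetical in-memory stand-in for a shared table, not Hibernate code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Immutable snapshot of a row together with its version number. */
final class VersionedRow {
    final String payload;
    final long version;

    VersionedRow(String payload, long version) {
        this.payload = payload;
        this.version = version;
    }
}

/** Hypothetical shared table accessed by both the web app and the crawler. */
final class UrlStore {
    private final Map<Long, VersionedRow> rows = new ConcurrentHashMap<>();

    void insert(long id, String payload) {
        rows.put(id, new VersionedRow(payload, 0));
    }

    VersionedRow read(long id) {
        return rows.get(id);
    }

    /**
     * Mirrors "UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?":
     * returns false when another writer got there first.
     */
    boolean update(long id, String newPayload, long expectedVersion) {
        VersionedRow current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        // Atomic swap: fails if the row was replaced since we read it.
        return rows.replace(id, current, new VersionedRow(newPayload, expectedVersion + 1));
    }
}

final class OptimisticLockDemo {
    public static void main(String[] args) {
        UrlStore store = new UrlStore();
        store.insert(1L, "http://example.com");

        VersionedRow seenByWebApp = store.read(1L);
        VersionedRow seenByCrawler = store.read(1L);

        boolean crawlerOk = store.update(1L, "crawled", seenByCrawler.version);
        boolean webAppOk = store.update(1L, "edited", seenByWebApp.version);

        System.out.println("crawler update applied: " + crawlerOk);
        System.out.println("web app update applied: " + webAppOk);
    }
}
```

The losing writer is not blocked; it simply learns that its data was stale and can re-read and retry, which is why this approach tends to suit two independent applications sharing one database.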