Java web application storing large objects - best available options - java

java - spring - web application
I have a web application with a wizard-based process for creating a complex entity. There are at least 10 screens to complete one process, but the problem is that at any step between 1 and 10 the user can leave without completing the process, and we want to store that data so the user can resume the process later. There are multiple tables involved in this process.
I am worried about saving the data to the database on every wizard step, because over time the database will accumulate orphaned, incomplete records that become garbage.
I would like to discuss possible solutions. Please advise.

You could serialize the data to XML or JSON and store it in the DB temporarily. This avoids dealing with multiple tables. You can use a timeout and remove those entries after a while (a few days, maybe). Once the wizard is completed, do the real save and remove the temporary data on success.
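A minimal sketch of that approach, assuming Jackson for the JSON serialization, Spring's JdbcTemplate, and a hypothetical wizard_progress table (the SQL is MySQL-flavoured; @Scheduled also needs @EnableScheduling on a configuration class):

    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Service;

    @Service
    public class WizardDraftService {

        private final JdbcTemplate jdbc;
        private final ObjectMapper mapper = new ObjectMapper();

        public WizardDraftService(JdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        // Save (or overwrite) the user's in-progress wizard state as one JSON blob.
        public void saveDraft(long userId, Object wizardState) throws Exception {
            String json = mapper.writeValueAsString(wizardState);
            jdbc.update(
                "REPLACE INTO wizard_progress (user_id, state_json, updated_at) VALUES (?, ?, NOW())",
                userId, json);
        }

        // Load the draft so the user can resume where they left off.
        public <T> T loadDraft(long userId, Class<T> type) throws Exception {
            String json = jdbc.queryForObject(
                "SELECT state_json FROM wizard_progress WHERE user_id = ?",
                String.class, userId);
            return mapper.readValue(json, type);
        }

        // Purge abandoned drafts nightly, e.g. anything untouched for 7 days.
        @Scheduled(cron = "0 0 3 * * *")
        public void purgeStaleDrafts() {
            jdbc.update("DELETE FROM wizard_progress WHERE updated_at < NOW() - INTERVAL 7 DAY");
        }
    }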

"after some time data will become clustered and orphan into the database"
Eh?
Delete the temporary/incomplete data when the user has successfully completed the process.

Related

Should I use Akka for the periodic task

I have a terminal server monitoring project. In the backend I use Spring MVC, MyBatis and PostgreSQL. Basically, I query session information from the DB, send it back to the front-end and display it to users. But there are some large queries (like counting total users, total sessions, etc.) that slow down the system when a user opens the website, so I want to run these queries as asynchronous tasks so the website opens fast instead of waiting for the query. I would also like to check the terminal server state from the DB periodically (every hour), and if a terminal server fails or its average load is too high, notify the admins. I do not know what I should use - maybe Akka, or some other way to do these two jobs (1. run the large queries asynchronously, 2. run some queries periodically)? Please help me, thanks!
You can achieve this using Spring and caching where necessary.
If the data you're displaying doesn't have to be real-time but can be near real-time, you can read it from the DB periodically and cache it. Your app then reads from the cache.
There are different approaches you can explore.
You can create a materialized view in PostgreSQL that holds the statistics you need. Depending on your requirements, you will have to decide how to handle refresh intervals etc.
Another approach is an application-level cache - you can leverage Spring for that (see the Spring docs on caching). You can populate the cache on startup and refresh it as necessary.
The task that runs every hour can again be implemented with Spring, using the @Scheduled annotation (see the Spring docs on task scheduling).
To answer your question - don't use Akka - you have all the tools necessary to achieve the task in the Spring ecosystem.
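A sketch of that combination, assuming @EnableCaching and @EnableScheduling are configured; StatsService, the "stats" cache name and SessionStats are hypothetical names:

    import org.springframework.cache.annotation.CacheEvict;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Service;

    @Service
    public class StatsService {

        // The first call runs the heavy query; subsequent calls hit the cache.
        @Cacheable("stats")
        public SessionStats totalStats() {
            return runHeavyAggregationQuery();
        }

        // Drop the cached value every hour so the next request recomputes it.
        // The hourly terminal-server health check / admin alert could live here too.
        @Scheduled(fixedRate = 3600000)
        @CacheEvict(value = "stats", allEntries = true)
        public void evictStats() {
        }

        private SessionStats runHeavyAggregationQuery() {
            // ... MyBatis/JDBC aggregation against PostgreSQL ...
            return new SessionStats();
        }

        public static class SessionStats {
            // totals, averages, etc.
        }
    }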
Akka is not very relevant here; it is meant for an event-driven programming model that deals with concurrency issues to build highly scalable multithreaded applications.
You can use the Spring task scheduler to run heavy queries periodically. If you want to keep it simple, you can solve your problem by storing data like total users, total sessions, etc. in the global application context, and updating it from the database periodically with the Spring scheduler. You can also store the same data in a separate database table, so it can easily be loaded at initialization time.
I really don't see why you would need "memcached", "materialized views", "WebSockets" or other heavyweight technologies and frameworks for caching such a small set of data. All you need is to maintain a set of global parameters in your application context and keep them updated by a scheduled task, as frequently as desired.
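A sketch of this simpler variant; the GlobalStats bean and the commented-out DAO calls are hypothetical:

    import java.util.concurrent.atomic.AtomicLong;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class GlobalStats {

        private final AtomicLong totalUsers = new AtomicLong();
        private final AtomicLong totalSessions = new AtomicLong();

        // Controllers read these cached values instead of hitting the DB.
        public long getTotalUsers() { return totalUsers.get(); }
        public long getTotalSessions() { return totalSessions.get(); }

        // Refresh every 10 minutes; tune the interval to how fresh the numbers must be.
        @Scheduled(fixedDelay = 600000)
        public void refresh() {
            // totalUsers.set(statsDao.countUsers());
            // totalSessions.set(statsDao.countSessions());
        }
    }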

Real-time data consumption from MySQL

I have a use case in which my data is stored in MySQL.
For each new row inserted into MySQL, I have to perform analytics on the new data.
How I am currently solving this problem:
My application is a Spring Boot application, in which I use a scheduler that checks for new rows in the database every 2 seconds.
The problem with the current approach:
Even if there is no new data in the MySQL table, the scheduler still fires a query to check whether new data is available.
One way to solve this type of problem in any SQL database is triggers.
But so far I have not succeeded in creating MySQL triggers that can call a Java-based Spring application, or even a plain Java application.
My question is:
Is there a better way to solve this use case? I am even open to switching to another storage (database) system if one is built for this type of use case.
This fundamentally sounds like an architecture issue. You're essentially using a database as an API, which, as you can see, causes all kinds of issues. Ideally, the DB would be wrapped in a service that notifies whichever systems need to know about new data. Let's look at a few options going forward.
Continue to poll
You didn't outline what the actual issue is with your current polling approach. Is running the job when it's not needed causing a problem of some kind? I'd be a proponent of just leaving it as-is, unless you're interested in making a larger change.
Database Trigger
While I'm unaware of a way to launch a java process via a db trigger, you can do an HTTP POST from one. With that in mind, you can have your batch job staged in a web app that uses a POST to launch the job when the trigger fires.
Wrap existing datastore in a service
This is, IMHO, the best option. It allows there to be a system of record that provides an API which can be versioned, etc. It would also allow any logic around who to notify to be encapsulated in this service.
Replace data store with something that allows for better notifications
Without any real information on what the data being stored is, it's hard to say how practical this is. But something like Apache Kafka or Apache Geode would provide the ability to be notified when new data is persisted (Kafka by listening to the topic, Geode via a continuous query).
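For illustration, a minimal spring-kafka consumer sketch; the "new-rows" topic, the "analytics" group id and the payload handling are all assumptions, not part of the original setup:

    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.stereotype.Component;

    @Component
    public class NewRowConsumer {

        // Invoked for every record published to the topic - no polling needed.
        @KafkaListener(topics = "new-rows", groupId = "analytics")
        public void onNewRow(String payload) {
            runAnalytics(payload);
        }

        private void runAnalytics(String payload) {
            // ... deserialize the payload and run the analytics for this row ...
        }
    }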
For the record, I'd advocate for the wrapping of the existing database in a service. That service would be the only way into the db and take on responsibility for any notifications required.

How to run a very long process in a Java-based web application?

I need to run a very long process in a Java-based Spring Boot web application. The process consists of the following steps:
Get details for about 300,000 users from the database.
Iterate over them.
Generate a PDF file for each user using iText.
Save the PDF file on the filesystem.
Update the database to record that the PDF file for the given user has been created.
Update the PDF path for the user in the database.
Now, this entire process can take a lot of time, possibly many hours or even days, as it involves creating a PDF file for each user and then many DB updates.
Also, I need this process to run in background so that the rest of the web application can run smoothly.
I am thinking of using Spring Batch or a messaging queue. I haven't really used either, so I am not sure whether they are the right frameworks for this kind of problem, or which of the two fits it best.
What is the ideal way to implement such kind of tasks?
If you can't name a requirement you expect a framework/library to satisfy, you most likely don't need one...
Generating PDFs may need a lot of processing power, so you might want to keep this background process away from your main web application, on its own machines.
A simple Java process is usually easier to control and to move around your environment.
To me this looks like a simple task for "plain" Java - KISS. Or am I missing something?
I'd make sure the Finder used to fetch the users from the database:
is restartable, i.e. only fetches unprocessed users (in case you have to stop processing because shit happens :-)
runs in batches, to keep DB round trips and load low
is multi-threadable, i.e. can fetch users split across a given number of threads (userId mod numberOfThreads, assuming userId is evenly distributed), so you can add more machines/threads if necessary (see the sketch below).
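A sketch of such a finder in plain JDBC; the users table, its pdf_path column (NULL meaning "not yet processed") and the id-modulo partitioning are assumptions for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class UnprocessedUserFinder {

        private final String url;
        private final String user;
        private final String password;
        private final int numberOfThreads;

        public UnprocessedUserFinder(String url, String user, String password, int numberOfThreads) {
            this.url = url;
            this.user = user;
            this.password = password;
            this.numberOfThreads = numberOfThreads;
        }

        // One batch of unprocessed user ids for a given partition.
        // Restartable: users whose PDF already exists (pdf_path set) are skipped.
        public List<Long> nextBatch(int threadIndex, int batchSize) throws SQLException {
            String sql = "SELECT id FROM users "
                       + "WHERE pdf_path IS NULL AND MOD(id, ?) = ? "
                       + "ORDER BY id LIMIT ?";
            try (Connection con = DriverManager.getConnection(url, user, password);
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setInt(1, numberOfThreads);
                ps.setInt(2, threadIndex);
                ps.setInt(3, batchSize);
                try (ResultSet rs = ps.executeQuery()) {
                    List<Long> ids = new ArrayList<>();
                    while (rs.next()) {
                        ids.add(rs.getLong(1));
                    }
                    return ids;
                }
            }
        }
    }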
You should use Spring Batch for this process. When the user presses the button, you launch the job asynchronously; it then runs in a separate thread and processes all your records. The current status of the job can be obtained from the job repository. Spring Batch is made for exactly this type of processing.
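A sketch of that asynchronous launch, assuming a configured Spring Batch Job bean named pdfGenerationJob and a JobLauncher backed by an asynchronous TaskExecutor (otherwise run() blocks until the job finishes):

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class PdfJobController {

        private final JobLauncher jobLauncher;
        private final Job pdfGenerationJob;

        public PdfJobController(JobLauncher jobLauncher, Job pdfGenerationJob) {
            this.jobLauncher = jobLauncher;
            this.pdfGenerationJob = pdfGenerationJob;
        }

        @PostMapping("/pdf-job")
        public String launch() throws Exception {
            // A unique parameter per run, so each button press is a new job instance.
            jobLauncher.run(pdfGenerationJob, new JobParametersBuilder()
                    .addLong("startedAt", System.currentTimeMillis())
                    .toJobParameters());
            return "Job started; progress is visible in the job repository.";
        }
    }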

How to synchronize data between MongoDB and OpenLDAP databases

I have two databases for one system. One is OpenLDAP and the other is MongoDB. To be specific, the OpenLDAP instance is used by Atlassian Crowd, which we use. I need to synchronize the users in these two databases. That is:
If I create a user, it is created in OpenLDAP by default, and it has to be created in MongoDB as well.
In the past there were issues handling this, and there may be users who exist in OpenLDAP but not in MongoDB. I need to find these users too.
If I delete or update a user in one, I need the delete or update to happen in both DBs.
I am also going to keep a cached copy of LDAP in Redis. What is the best way to synchronize data between these two databases to meet the above expectations?
If it helps, I am using Java in the backend.
Two possible ways:
(Preferred) Design your code so that you can "plug in" database operators for the different databases, and access them through a facade that hides the underlying stores. Creating a user, for example, would look something like this:
createUser() -> for each dbHandle do dbHandle->createUser()
The same applies to deleting or updating any data. This approach should also solve problem 2.
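A sketch of that facade in Java; the concrete OpenLDAP and MongoDB implementations of UserStore are assumed to exist:

    import java.util.List;

    public interface UserStore {
        void createUser(String username);
        void updateUser(String username);
        void deleteUser(String username);
    }

    class CompositeUserStore implements UserStore {

        // e.g. [openLdapUserStore, mongoUserStore]
        private final List<UserStore> stores;

        CompositeUserStore(List<UserStore> stores) {
            this.stores = stores;
        }

        // Every operation fans out to all configured databases.
        @Override
        public void createUser(String username) {
            stores.forEach(s -> s.createUser(username));
        }

        @Override
        public void updateUser(String username) {
            stores.forEach(s -> s.updateUser(username));
        }

        @Override
        public void deleteUser(String username) {
            stores.forEach(s -> s.deleteUser(username));
        }
    }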
You can update just one database and have a background script that propagates the changes to the other databases. This approach lets you work with just one database and leaves the rest to the script, but it is far more expensive and less reliable (as you might read from a database that has not yet been updated from the master).

Database sharing between two servers

My current setup is a single dedicated server running a Java/Hibernate app on Tomcat, the Apache HTTP server, and MySQL.
I need to get a second server to share the load, while using the same database as the first server.
The backend processing (excluding DB transactions) is time-consuming, hence the second server for backend processing.
Will there be any unwanted consequences of this setup? Is this the optimal setup?
My apps do update/delete and has transaction control as follows:
beginTransaction();
getSession().save(obj);
//sessionFactory.openSession().save(obj);
commitTransaction();
As long as only one of the apps does database updates on a shared table you should be fine. What you definitely don't want to happen is:
app1: delete/update table1.record24
app2: delete/update table1.record24
because when Hibernate writes the records, one of the processes will notice that the data has changed and throw an error. And being a classic Heisenbug, it's really difficult to reproduce.
When, on the other hand, the responsibilities are clearly separated (the apps share data for reading, but do not delete/update the same tables), it should be fine. Document that behavior though, as a future upgrade may not take it into account.
EDIT 1: Answering comments
You overcome concurrency issues by design. For any given table:
Both apps may insert
Both apps may select
only one of the apps may also update/delete in that table
Your frontend will probably insert into tables, and the backend can read those tables, update rows where necessary, create new result rows, and delete rows as cleanup.
Alternatively, when the apps communicate, the frontend can transfer ownership of the records for a given task to the business backend, which gives ownership back when finished. Make sure the Hibernate cache is flushed (the transaction is committed) and that no Hibernate objects of that task are still in use before transferring ownership.
The trick of the game is to ensure that Hibernate never attempts to write records that have been changed by the other app, as that would result in a StaleStateException.
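For illustration, this is what a versioned entity looks like; with a @Version column, Hibernate raises the optimistic-locking error instead of silently overwriting the other app's change (javax.persistence here; jakarta.persistence in newer stacks):

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Version;

    @Entity
    public class Record {

        @Id
        private Long id;

        private String payload;

        // Incremented by Hibernate on every update; a mismatch at flush time
        // means another process changed the row in the meantime.
        @Version
        private long version;

        // getters/setters omitted
    }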
An example of how I solved a similar problem:
app 1 receives data, and writes it in table1
app 2 reads table1, processes it, and writes/updates table2
app 2 deletes the processed records in table1
Note that app 1 only writes to the shared table. It also reads from, writes to and updates other tables, but those tables are not accessed by app 2, so that's no problem.
It is a fairly common approach, both for failover and load balancing.
Here's a short article describing the setup:
http://raibledesigns.com/tomcat/
Beware of singletons in this setup.
