Hibernate auto delete old data - java

I'm using Hibernate 4 to manage all the database connections.
In the table I'm creating I'd like to keep only the last 24 hours of data for statistic calculation.
Is there a way to automatically delete older data in the table (obviously there's a field EVENTDATA of type DATETIME), or do I have to do this manually every x minutes?

You could use job scheduling with a cron trigger to achieve this. If you use the cron expression 0 0 0 * * * then the delete job will be invoked every night at 00:00.
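A minimal sketch of that approach with Spring's @Scheduled; the StatisticEvent entity and its eventData field are hypothetical names for your mapping, and the application is assumed to have @EnableScheduling and a configured JPA EntityManager:

import java.util.Date;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class OldDataCleaner {

    @PersistenceContext
    private EntityManager em;

    // Runs every night at 00:00 (Spring cron fields: sec min hour day month weekday).
    @Scheduled(cron = "0 0 0 * * *")
    @Transactional
    public void deleteOldRows() {
        // Bulk-delete all rows whose EVENTDATA is older than 24 hours.
        // StatisticEvent/eventData are placeholders for your entity mapping.
        Date cutoff = new Date(System.currentTimeMillis() - 24L * 60 * 60 * 1000);
        em.createQuery("DELETE FROM StatisticEvent e WHERE e.eventData < :cutoff")
          .setParameter("cutoff", cutoff)
          .executeUpdate();
    }
}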

Related

Best approach to lock editing certain record in DB

I am working on a Spring Boot project. The task is: I should lock the editing capability of a product for 15 minutes after creation, so basically if the user creates a product, this product will be locked for editing for 15 minutes; after that it can be changed or deleted from the DB.
My question is: what is the best approach to achieve that?
1- Should I add a field to the DB table called lastUpdate and then check whether the 15 minutes have been exceeded?
2- Should I save all the newly created products in an array and clear this array every 15 minutes?
Or is there any better way in regard to performance and best practice?
I am using Spring Boot with JPA & MySQL.
Thanks.
You should not use the locking available in InnoDB.
Instead, you should have some column in some table that controls the lock. It should probably be a TIMESTAMP so you can decide whether the 15 minutes has been used up.
If the 'expiration' and 'deletion' are triggered by some db action (attempt to use the item, etc.), check it as part of that db action. The expiration check (and delete) should be part of the transaction that includes that action; this will use InnoDB locking, but only briefly.
If there is no such action, then use either a MySQL EVENT or an OS "cron job" to run every few minutes and purge anything older than 15 minutes. (There will be a slight delay in purging, but that should not matter.)
If you provide the possible SQL statements that might occur during the lifetime of the items, I may be able to be more specific.
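For example, the 15-minute rule can be enforced inside the UPDATE statement itself, so the check and the action share one transaction and no explicit lock is held; a sketch with JdbcTemplate, where the table and column names are illustrative:

import org.springframework.jdbc.core.JdbcTemplate;

public class ProductEditor {

    // Returns true only if the product exists AND is older than 15 minutes;
    // the row is never touched while it is still "locked".
    // Table/column names (product, name, created_at) are illustrative.
    public boolean rename(JdbcTemplate jdbc, long id, String newName) {
        int rows = jdbc.update(
            "UPDATE product SET name = ? " +
            "WHERE id = ? AND created_at <= NOW() - INTERVAL 15 MINUTE",
            newName, id);
        return rows == 1;
    }
}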
You can add a check to your update and delete methods. If there are many such methods, you can use AOP.
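A rough sketch of the AOP variant, assuming spring-boot-starter-aop is on the classpath; the Product type (assumed to expose getCreatedAt()) and the service package in the pointcut are hypothetical:

import java.time.Duration;
import java.time.Instant;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class EditLockAspect {

    // Runs before any update* method of ProductService that receives the product.
    // Adjust the pointcut to match your own packages and method names.
    @Before("execution(* com.example.service.ProductService.update*(..)) && args(product,..)")
    public void checkLock(Product product) {
        Duration age = Duration.between(product.getCreatedAt(), Instant.now());
        if (age.compareTo(Duration.ofMinutes(15)) < 0) {
            throw new IllegalStateException(
                "Product is locked for editing for 15 minutes after creation");
        }
    }
}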
You can make use of both of the approaches you have mentioned.
First, it's good to have a lastUpdated field in tables, which would also help you with other functionality in the future.
And then you can have an internal cache (a map holding time and object reference), store objects in it, and restrict editing for them. You can run a scheduler to check every minute and clear objects from your map, making them available for updating again.
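A sketch of such a cache; note that an in-memory map only works for a single application instance:

import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class EditLockCache {

    // product id -> creation time
    private final Map<Long, Instant> locked = new ConcurrentHashMap<>();

    public void lock(Long productId) {
        locked.put(productId, Instant.now());
    }

    public boolean isLocked(Long productId) {
        return locked.containsKey(productId);
    }

    // Every minute, release locks older than 15 minutes.
    @Scheduled(fixedRate = 60_000)
    public void evictExpired() {
        Instant cutoff = Instant.now().minusSeconds(15 * 60);
        locked.values().removeIf(created -> created.isBefore(cutoff));
    }
}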
You could put your new products in an "incoming_products" table and put a timestamp column in that table that you set to date_add(now(), INTERVAL 15 MINUTE).
Then have a @Scheduled method in your Boot application run every minute to check if there are incoming products where the timestamp column is < now(), insert them as products, and delete the corresponding incoming_products records.
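A sketch of that @Scheduled method with JdbcTemplate; the column names are illustrative, and the cutoff is captured once in Java so the INSERT and DELETE see exactly the same set of rows:

import java.sql.Timestamp;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class IncomingProductPromoter {

    @Autowired
    private JdbcTemplate jdbc;

    // Every minute, promote incoming products whose 15-minute hold has expired.
    // Column names (name, price, release_at) are illustrative.
    @Scheduled(fixedRate = 60_000)
    @Transactional
    public void promote() {
        Timestamp cutoff = new Timestamp(System.currentTimeMillis());
        jdbc.update("INSERT INTO products (name, price) " +
                    "SELECT name, price FROM incoming_products WHERE release_at < ?",
                    cutoff);
        jdbc.update("DELETE FROM incoming_products WHERE release_at < ?", cutoff);
    }
}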

Starting a new job instance with last processed record as jobParameter in Spring Batch?

We are working on a Spring Batch job. The job will run every day for about 6 hours and will fetch a value for each record from a REST service. Once the value is retrieved from the REST service, it is updated on the corresponding record. For example:
--------------------
Student
--------------------
Id | Name  | Marks
--------------------
1  | John  | Null
2  | Sam   | Null
3  | Lilly | Null
Iterate over each record (in ASC order) and fetch the Marks from the REST service based on the Id, then update the Marks column with the marks retrieved. The REST service does not support batch operations and can only handle one record at a time.
Proposed Solution:
Read data from the db using a RepositoryItemReader with a fixed page size in ASC order (a reader sketch follows these steps). Since by default there is no range of Ids, the job will continue to run forever (it will be stopped after ~6 hours every day).
Call a REST service to fetch the marks based on each record's Id and update the Student object with the marks. (CustomItemProcessor)
Update the Student object using a RepositoryItemWriter.
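A minimal sketch of such a reader, assuming a StudentRepository that extends PagingAndSortingRepository<Student, Long>; the repository and page size are illustrative:

import java.util.Collections;
import org.springframework.batch.item.data.RepositoryItemReader;
import org.springframework.data.domain.Sort;

public class ReaderConfig {

    // Reads Student rows page by page in ascending Id order.
    // StudentRepository is an assumed Spring Data paging repository.
    public RepositoryItemReader<Student> studentReader(StudentRepository repository) {
        RepositoryItemReader<Student> reader = new RepositoryItemReader<>();
        reader.setRepository(repository);
        reader.setMethodName("findAll"); // any paging repository method works
        reader.setPageSize(100);
        reader.setSort(Collections.singletonMap("id", Sort.Direction.ASC));
        return reader;
    }
}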
Problems that need to be resolved:
There are 2 problems:
1. Need to know the last processed record in order to resume from there (we would like to create a new job instance every day).
In order to run the job every day, we can benchmark the job and estimate the number of records it will process per day. Based on that, we can define Id ranges in a static table so that the job reads a range from the table and processes the records within that range. This solution is not very elegant.
Another approach would be to store the last fetched (not read) Id in a tracking table and use it as the lower limit for the next day. I am not sure how I can achieve this elegantly.
2. Improve the performance of the job.
In single-threaded sequential mode, the performance is not very good: it is only able to process 1 record per 2 seconds (0.5 records/second). I used a ThreadPoolTaskExecutor with a thread pool size of around 10 and was able to achieve a performance of 4 records/second (which is ideal for us).
Since we are also using a ThreadPoolTaskExecutor, it is not straightforward to know the last processed record.
Ordering is not compatible with multi-threading: "first" and "last" are undefined if you go parallel. You need to find a way to mark records that is agnostic to serial versus parallel execution. I highly recommend finding a correct solution to the problem before introducing multi-threading: correctness is more important than performance.
We want to run a new Job instance every day.
This means that the current date is a good candidate for an identifying job parameter.
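A sketch of launching such an instance; the job wiring is assumed, and job parameters added this way are identifying by default, so exactly one job instance exists per calendar day:

import java.time.LocalDate;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class DailyLauncher {

    // One job instance per day: relaunching on the same date restarts that
    // day's instance (resuming a failed/stopped run) rather than creating a new one.
    public void launchDaily(JobLauncher jobLauncher, Job job) throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addString("runDate", LocalDate.now().toString())
                .toJobParameters();
        jobLauncher.run(job, params);
    }
}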

Set some trigger that removes rows older than a few minutes from DB with Spring Boot

I have a Java app on Spring Boot with a Cassandra DB, where I'm writing Person entities to the DB.
Each person row in the DB must be deleted when it gets 5 minutes old, so the concept is easy: a person is added to the DB with a timestamp, and that person must be removed after exactly 5 minutes.
The only idea that comes to mind is a Spring Scheduler which runs every second and checks every row for expiry, deleting it if it has expired.
Since you are using Cassandra as a DB, you could leverage the Cassandra TTL feature.
During data insertion, you have to specify the 'ttl' value in seconds; 'ttl' is the time-to-live value for the data. After that particular amount of time, the data will be automatically removed.
The TTL syntax in CQL looks like this:
INSERT INTO person (name, age) VALUES ('ExampleName', 39) USING TTL 300;
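If you insert through Spring Data Cassandra (2.x is assumed here), the TTL can also be set per insert instead of in raw CQL; a sketch, where the Person entity and the CassandraOperations bean are assumed:

import java.time.Duration;
import org.springframework.data.cassandra.core.CassandraOperations;
import org.springframework.data.cassandra.core.InsertOptions;

public class PersonWriter {

    // Person is your mapped entity; the row is removed by Cassandra
    // 5 minutes after the write.
    public void save(CassandraOperations cassandra, Person person) {
        InsertOptions fiveMinuteTtl = InsertOptions.builder()
                .ttl(Duration.ofMinutes(5))
                .build();
        cassandra.insert(person, fiveMinuteTtl);
    }
}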
Instead of running a Spring Scheduler every second, you could easily create a timer task for each record: using TimerTask (core Java), you create a task that executes after the set interval and deletes the record.
Below is a useful link with examples:
https://www.baeldung.com/java-timer-and-timertask
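A sketch of the TimerTask approach (PersonRepository is an assumed Spring Data repository); note that pending tasks live only in JVM memory, so deletes scheduled before a restart are lost:

import java.util.Timer;
import java.util.TimerTask;

public class ExpireScheduler {

    private static final long FIVE_MINUTES_MS = 5 * 60 * 1000L;
    private final Timer timer = new Timer(true); // daemon thread

    // Call this right after saving the person.
    public void scheduleDelete(final PersonRepository repository, final String personId) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                repository.deleteById(personId);
            }
        }, FIVE_MINUTES_MS);
    }
}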

trigger java function on mysql datetime

I have several datetime columns in my MySQL DB. I want to trigger a Java function when the date is reached. At worst, triggering a MySQL function would do the job as well. How can I get a datetime-based trigger in MySQL without running a cron job every minute?
Even a trigger wouldn't do the job; there must be some process that checks (in your case, whether the date was reached).
Like Thomas said, you need a job or a task (cron) that sets off the action, or an application to do what you wish with the database.
It is not ideal to do this in the database, but if there is no better choice, you can achieve it by creating a MySQL event, which is a scheduled task.
You need to add an insert and/or update trigger to the database table and create the event based on the datetime value of the column.
You can create the event in such a way that it drops itself after being executed at the specified time.
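One caveat: CREATE EVENT commits implicitly, so it cannot run inside the trigger itself; a workable variant is to create the one-shot event from the application when the row is written. A JDBC sketch with illustrative table/column names (it requires event_scheduler=ON and the EVENT privilege, and 'when' must come from a trusted source since DDL cannot be parameterized):

import java.sql.Connection;
import java.sql.Statement;

public class EventScheduler {

    // Creates a one-shot MySQL event that fires at the given datetime and,
    // with the default ON COMPLETION NOT PRESERVE, drops itself afterwards.
    // my_table/processed are illustrative names.
    public void scheduleAction(Connection conn, long rowId, String when) throws Exception {
        String sql =
            "CREATE EVENT handle_row_" + rowId +
            " ON SCHEDULE AT '" + when + "'" +
            " DO UPDATE my_table SET processed = 1 WHERE id = " + rowId;
        try (Statement st = conn.createStatement()) {
            st.execute(sql);
        }
    }
}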

PostgreSQL: How to make updates readable from other transactions while first transaction not finished

I do simple schedule service.
One table with jobs
CREATE TABLE system_jobs (
    id bigserial,
    job_time timestamp without time zone,
    job_function text,
    run_on text,
    CONSTRAINT system_jobs_pri PRIMARY KEY (id)
);
Multiple Java daemons select all rows where job_time < now() and execute the job_function (passing the row's id as an argument).
A job_function sample:
CREATE OR REPLACE FUNCTION public.sjob_test(in_id bigint)
  RETURNS text AS
$BODY$
DECLARE
    utc timestamp without time zone;
BEGIN
    utc := timezone('UTC', now());
    -- These changes are not visible from other transactions
    UPDATE system_jobs SET run_on = 'hello' WHERE id = in_id;
    PERFORM pl_delay(60); -- Delay 1 minute
    UPDATE system_jobs SET job_time = now() + interval '10 seconds', run_on = '' WHERE id = in_id;
    RETURN 'done';
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
Inside the function, before the 60-second delay, I update the run_on field, and after the delay I reset it.
I expect run_on to contain 'hello' during the delay (60 sec) and to be readable from other transactions, but it is not.
My task is to prevent different Java daemons from executing the same job_function simultaneously. I want to check run_on before executing.
I have read many docs and blogs about transaction isolation levels, but I don't understand how to use them in practice.
How can I configure my function, table, or external process so that other transactions can see these changes?
PostgreSQL doesn't support dirty reads. See PostgreSQL documentation:
PostgreSQL's Read Uncommitted mode behaves like Read Committed. This is because it is the only sensible way to map the standard isolation levels to PostgreSQL's multiversion concurrency control architecture.
But it looks like there is a workaround, called autonomous transactions, that might help you. There are at least two ways to implement it. See more info here and here.
Using these autonomous transactions, you can commit the change of run_on inside your function, so other transactions will be able to read it.
There is only one way to do this: via dblink. Something like:
PERFORM dblink('your server config', 'UPDATE ...');
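From the application side, the same idea can be implemented without dblink by claiming the job in its own short transaction and committing before calling the long-running function, so other daemons see run_on immediately. A plain-JDBC sketch; it assumes run_on defaults to '' and that this claim replaces the 'hello' update inside the function:

import java.sql.Connection;
import java.sql.PreparedStatement;

public class JobRunner {

    public void runJob(Connection conn, long jobId) throws Exception {
        conn.setAutoCommit(false);

        // Atomically claim the row; 0 updated rows means another daemon won.
        try (PreparedStatement claim = conn.prepareStatement(
                "UPDATE system_jobs SET run_on = 'hello' WHERE id = ? AND run_on = ''")) {
            claim.setLong(1, jobId);
            if (claim.executeUpdate() == 0) {
                conn.rollback();
                return;
            }
        }
        conn.commit(); // the claim is now visible to every other transaction

        // Run the (long) job function in a second transaction.
        try (PreparedStatement run = conn.prepareStatement("SELECT sjob_test(?)")) {
            run.setLong(1, jobId);
            run.executeQuery();
        }
        conn.commit();
    }
}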
