Concurrency issue in database operations in vertx - java

I have to insert two attributes (device_id, timestamp) into a table, but before the insert I have to delete the previous day's records and run a SELECT COUNT to get the total number of records in the table.
Based on the count value, data will be inserted into the table.
I have a total of 3 queries, which work fine for single-user testing, but if I run a concurrency test with 10 or more users, my code breaks.
I am using HSQLDB and the Vert.x JDBC client.
Is there a way to merge all three queries?
The queries are :
DELETE FROM table_name WHERE timestamp <= DATE_SUB(NOW(), INTERVAL 1 DAY)
SELECT COUNT(*) FROM table_name WHERE device_id = ?
INSERT INTO table_name (device_id, timestamp) VALUES (?, ?)

You need to set auto-commit to false and commit after the last statement.
If the database's transaction control is in the default LOCKS mode, you will not get any inconsistency, because the table is locked by the DELETE statement until the commit.
If you have changed the transaction control to MVCC, then it depends on how you use the COUNT in the INSERT statement.
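As a minimal sketch of that flow with the Vert.x 3.x JDBC client (callback style, error handling omitted); jdbcClient, deviceId, ts, and maxRecords are placeholder names, and the exact API may differ in your Vert.x version:

import io.vertx.core.json.JsonArray;
import io.vertx.ext.jdbc.JDBCClient;
import io.vertx.ext.sql.SQLConnection;

jdbcClient.getConnection(connRes -> {
    SQLConnection conn = connRes.result();
    conn.setAutoCommit(false, ac ->                          // opens the transaction
        conn.execute("DELETE FROM table_name WHERE timestamp <= DATE_SUB(NOW(), INTERVAL 1 DAY)", del ->
            conn.queryWithParams("SELECT COUNT(*) FROM table_name WHERE device_id = ?",
                    new JsonArray().add(deviceId), countRes -> {
                long count = countRes.result().getResults().get(0).getLong(0);
                if (count < maxRecords) {                    // hypothetical threshold
                    conn.updateWithParams(
                            "INSERT INTO table_name (device_id, timestamp) VALUES (?, ?)",
                            new JsonArray().add(deviceId).add(ts),
                            ins -> conn.commit(done -> conn.close()));
                } else {
                    conn.commit(done -> conn.close());       // nothing to insert
                }
            })));
});

Because each request now runs its three statements inside one committed transaction, concurrent users serialize on the table lock (in LOCKS mode) instead of interleaving their deletes, counts, and inserts.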

Related

Is the order of the rows in the ResultSet constant for the same SQL query via JDBC on the same state of DB data?

I'm trying to write a job that executes an SQL query in Java using JDBC drivers (the DB vendor can be Oracle, DB2, or Postgres).
The query does not really matter. Let's say it filters on certain values in a few columns of one DB table, and the result is a few thousand rows.
For each row in the ResultSet I need to do some logic, and sometimes that can fail.
I track the cursor position, so I "remember" the position of the last successfully processed row.
Now I want to implement a "resume" functionality in case of failure, in order not to process the entire ResultSet again.
I went through the JDBC spec for Java 8 and found nothing about the order of the rows (is it the same for the same query on the same data, or not).
I also failed to find anything in the DB vendors' specs.
Can anyone hint where to look for an answer about row-order predictability?
You can guarantee the order of rows by including an ORDER BY clause that includes all of the columns required to uniquely identify a row. In fact, that's the only way to guarantee the order from repeated invocations of a SELECT statement, even if nothing has changed in the database. Without an unambiguous ORDER BY clause the database engine is free to return the rows in whatever order is most convenient for it at that particular moment.
Consider a simple example:
You are the only user of the database. The database engine has an in-memory row cache that can hold the last 1000 rows retrieved. The database server has just been restarted, so the cache is empty. You run SELECT * FROM tablename and the database engine retrieves 2000 rows, the last 1000 of which remain in the cache. Then you run SELECT * FROM tablename again. The database engine checks the cache and finds the 1000 rows from the previous query, so it returns them immediately, because in doing so it won't have to hit the disk again. Then it proceeds to fetch the other 1000 rows. The net result is that the 1000 rows returned last by the initial SELECT are returned first by the subsequent SELECT.
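For the resume requirement itself, a common pattern is to order by a unique key and remember the last processed key value rather than a row position. A sketch, assuming a unique id column and a hypothetical process() method:

PreparedStatement ps = connection.prepareStatement(
        "SELECT id, payload FROM jobs WHERE id > ? ORDER BY id");
ps.setLong(1, lastProcessedId);              // 0 on the first run
try (ResultSet rs = ps.executeQuery()) {
    while (rs.next()) {
        process(rs);                         // hypothetical per-row logic that may fail
        lastProcessedId = rs.getLong("id");  // persist this to resume after a failure
    }
}

Restarting with WHERE id > lastProcessedId works on all three vendors and does not depend on the engine returning rows in the same order twice.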

Batching "UPDATE vs. INSERT" Queries Against Oracle Database

Let's assume that I have an Oracle database with a table called RUN_LOG that I am using to record when jobs have been executed.
The table has a primary key JOB_NAME, which uniquely identifies the job that has been executed, and a column called LAST_RUN_TIMESTAMP, which reflects when the job was last executed.
When a job starts, I would like to update the existing row for that job (if it exists), or otherwise insert a new row into the table.
Given that Oracle does not support a REPLACE INTO-style query, it is necessary to try an UPDATE, and if zero rows are affected, follow it up with an INSERT.
This is typically achieved with JDBC using something like the following:
PreparedStatement updateStatement = connection.prepareStatement("UPDATE ...");
PreparedStatement insertStatement = connection.prepareStatement("INSERT ...");

updateStatement.setString(1, "JobName");
updateStatement.setTimestamp(2, timestamp);

// If there are no rows to update, it must be a new job...
if (updateStatement.executeUpdate() == 0) {
    // Follow-up
    insertStatement.setString(1, "JobName");
    insertStatement.setTimestamp(2, timestamp);
    insertStatement.executeUpdate();
}
This is a fairly well-trodden path, and I am very comfortable with this approach.
However, let's assume my use case requires me to insert a very large number of these records. Performing individual SQL queries against the database would be far too "chatty". Instead, I would like to start batching these INSERT/UPDATE queries.
Given that execution of the UPDATE queries is deferred until the batch is committed, I cannot observe how many rows were affected until later.
What is the best mechanism for achieving this REPLACE INTO-like result?
I'd rather avoid using a stored procedure, as I'd prefer to keep my persistence logic in this one place (class), rather than distributing it between the Java code and the database.
What about the SQL MERGE statement? You can insert a large number of records into a temporary table, then merge the temp table with RUN_LOG. For example:
merge into RUN_LOG tgt
using (
    select job_name, last_run_timestamp
    from my_new_temp_table
) src
on (src.job_name = tgt.job_name)
when matched then update set
    tgt.last_run_timestamp = src.last_run_timestamp
when not matched then insert (tgt.job_name, tgt.last_run_timestamp)
    values (src.job_name, src.last_run_timestamp);
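A rough sketch of the surrounding Java, assuming the temp table already exists; RunRecord is a hypothetical record type and error handling is omitted:

connection.setAutoCommit(false);

// Stage all the records with one batched INSERT round trip
try (PreparedStatement stage = connection.prepareStatement(
        "INSERT INTO my_new_temp_table (job_name, last_run_timestamp) VALUES (?, ?)")) {
    for (RunRecord r : records) {          // hypothetical record type
        stage.setString(1, r.getJobName());
        stage.setTimestamp(2, r.getTimestamp());
        stage.addBatch();
    }
    stage.executeBatch();
}

// Then reconcile the staging table with RUN_LOG in a single statement
try (Statement merge = connection.createStatement()) {
    merge.executeUpdate(MERGE_SQL);        // the MERGE statement shown above
}
connection.commit();

Note that MERGE is itself a single DML statement, so if you would rather skip the staging table, you can also prepare a parameterized MERGE and addBatch() it directly, one batch entry per record.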

Partial Data deletes using Java SQL Hibernate

Our Java program has to delete a large number of records from DB2 tables, but it is running out of transaction log space.
We are working on increasing the log space, but due to internal processes and other things it will take a week or more to complete.
We are looking for a temporary way to delete a few records at a time instead of deleting them all at once. For example, when we have 1000 records to delete, we want the program to delete 50 records at a time, commit, and proceed to the next 50 until all 1000 records are deleted. If the delete fails after X records have been deleted, that is still fine.
We are using Hibernate.
Any suggestions on how this can be achieved? I looked at checking the SQLSTATE and SQLCODE in Java even for successful SQL executions, but I couldn't find a way.
Something like: loop do ... while (SQLCODE does not indicate completion).
We cannot delete from the back end, as the deletions are supposed to happen on user requests from the Java web application, and in addition we have some table constraints and delete cascades.
I don't know Hibernate, but you can change your delete to:
delete from (
    select * from schema.table where id = :id
    fetch first 50 rows only
)
Assuming executeUpdate() returns the number of rows deleted, you can put this in a loop:
Query stmt = session.createSQLQuery(
        "delete from (select * from schema.table where id = :id fetch first 50 rows only)");
stmt.setParameter("id", id);
int n;
do {
    Transaction transaction = session.beginTransaction();
    n = stmt.executeUpdate();          // rows deleted in this chunk
    transaction.commit();
} while (n > 0);                       // stop once nothing is left to delete

How can I avoid changes in DB while making a set of queries?

In my web application (Java + Spring + JPA) I have a method executing a set of related queries. In particular I need to know the total row count of a table and the row count of the result set for a certain query.
Obviously between these two queries changes can happen in my table: a new row added, row removed, field value changed, etc.
The table has millions of rows, so it's impossible to load the whole table into memory and filter in the application.
So I need to find a way to execute a set of queries maintaining the same "state" for the table (some kind of snapshot).
Is it sufficient to execute queries inside the same transaction, or is there some other approach?
UPDATE
The method is used for table pagination. I need to show n rows (PAGE) taken from the m rows matching a filter (SEARCH) out of a total of t existing rows (TOTAL).
So basically I need to extract n records and provide two numbers: the filtered row count and the total row count.
I can execute SELECT count(*) FROM table, then SELECT count(*) FROM table WHERE <search criteria>, and then SELECT * FROM table WHERE <search criteria> LIMIT <n>, but I must be sure that no change happens in between...
I'm using MySQL 5
My tests on MySQL 5.5.28 show that you can rely on the transaction alone: at the REPEATABLE READ isolation level (the default for InnoDB), rows inserted, deleted, or changed by other sessions after your first read are not counted. In other words, COUNT() is subject to the transaction isolation level, and according to the documentation this mode is exactly what you want: a consistent snapshot established by the first read.
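A minimal sketch with Spring and JPA, assuming Hibernate as the provider (whose dialect honors the isolation attribute); mytable, the status filter, and PageResult are illustrative names:

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class PageService {

    @PersistenceContext
    private EntityManager em;

    // All three reads share one snapshot thanks to REPEATABLE READ
    @Transactional(isolation = Isolation.REPEATABLE_READ, readOnly = true)
    public PageResult fetchPage(String status, int limit) {
        long total = ((Number) em.createNativeQuery(
                "SELECT COUNT(*) FROM mytable").getSingleResult()).longValue();
        long filtered = ((Number) em.createNativeQuery(
                "SELECT COUNT(*) FROM mytable WHERE status = ?1")
                .setParameter(1, status).getSingleResult()).longValue();
        List<?> rows = em.createNativeQuery(
                "SELECT * FROM mytable WHERE status = ?1")
                .setParameter(1, status).setMaxResults(limit).getResultList();
        return new PageResult(total, filtered, rows);   // hypothetical DTO
    }
}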

Updating a database while using a PreparedStatement select

I'm selecting a subset of data from an MS SQL database, using a PreparedStatement.
While iterating through the ResultSet, I also want to update the rows. At the moment I use something like this:
prepStatement = con.prepareStatement(
        selectQuery,
        ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_UPDATABLE);
rs = prepStatement.executeQuery();
while (rs.next()) {
    rs.updateInt("number", 20);
    rs.updateRow();
}
The database is updated with the correct values, but I get the following exception:
Optimistic concurrency check failed. The row was modified outside of this cursor.
I've Googled it, but haven't been able to find any help on the issue.
How do I prevent this exception? Or since the program does do what I want it to do, can I just ignore it?
The record has been modified between the moment it was retrieved from the database (through your cursor) and the moment you attempted to save it back. If the number column can be updated safely, independently of the rest of the record or of some other process having already set it to another value, you might be tempted to do:
con.createStatement().executeUpdate("UPDATE table SET number = 20 WHERE id = " + rs.getInt("id"));
However, the race condition persists, and your change may be in turn overwritten by another process.
The best strategy is to ignore the exception (the record was not updated), possibly pushing the failed record to an in-memory queue, then do a second pass over the failed records (re-evaluating the conditions in the query and updating as appropriate; add number <> 20 as one of the conditions in the query if it is not already there). Repeat until no more records fail. Eventually all records will be updated.
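A sketch of that retry strategy, continuing the snippet above (con and rs in scope); the id column and table name are illustrative:

Deque<Integer> failed = new ArrayDeque<>();
while (rs.next()) {
    try {
        rs.updateInt("number", 20);
        rs.updateRow();
    } catch (SQLException optimisticFailure) {
        failed.add(rs.getInt("id"));       // remember the row, keep going
    }
}

// Second pass: plain UPDATEs for the stragglers; the number <> 20 condition
// makes the statement a no-op if another process got there first
try (PreparedStatement retry = con.prepareStatement(
        "UPDATE mytable SET number = ? WHERE id = ? AND number <> ?")) {
    for (Integer id : failed) {
        retry.setInt(1, 20);
        retry.setInt(2, id);
        retry.setInt(3, 20);
        retry.executeUpdate();
    }
}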
Assuming you know exactly which rows you will update, I would do:
SET AUTOCOMMIT OFF
SET ISOLATION LEVEL SERIALIZABLE
SELECT col1, col2 FROM table WHERE somecondition FOR UPDATE
UPDATE the rows
COMMIT
This is achieved via pessimistic locking (and assuming row locking is supported in your DB, it should work).
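A sketch of those steps in plain JDBC. Since the question targets MS SQL Server, which spells row locking as a table hint rather than FOR UPDATE, the SELECT below uses WITH (UPDLOCK, ROWLOCK); table and column names are placeholders:

con.setAutoCommit(false);
con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);

try (PreparedStatement select = con.prepareStatement(
            "SELECT id FROM mytable WITH (UPDLOCK, ROWLOCK) WHERE somecolumn = ?");
     PreparedStatement update = con.prepareStatement(
            "UPDATE mytable SET number = ? WHERE id = ?")) {
    select.setInt(1, someValue);
    try (ResultSet rs = select.executeQuery()) {
        while (rs.next()) {                  // rows stay locked until COMMIT
            update.setInt(1, 20);
            update.setInt(2, rs.getInt("id"));
            update.addBatch();
        }
    }
    update.executeBatch();
    con.commit();                            // releases the row locks
}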
