Bulk Update DB2 using Hibernate and Multi Threading - java

I have a requirement to update more than 1,000,000 records in a DB2 database.
I tried using Hibernate with a multi-threaded application updating the records. However, on doing so I got a LockAcquisitionException. I suspect it's because of the bulk commits I am doing combined with multiple threads.
Can someone recommend a better solution or a better way to do this?
Please let me know if I need to upload the code I am using.
Thanks in advance.
// Code run concurrently by multiple threads
Transaction tx = session.beginTransaction();
for (EncryptRef abc : arList) {
    String encrypted = keyUtils.encrypt(abc.getNumber()); // encrypt some data
    Object o = session.load(EncryptRef.class, new Long(abc.getId())); // load by primary key
    EncryptRef object = (EncryptRef) o;
    object.setEncryptedNumber(encrypted); // updating the row
}
tx.commit(); // bulk committing the updates
The table contains just three columns: ID | PlainText | EncryptedText.
Update:
I tried batch updates using JDBC prepared statements. However, I am still facing the exception below:
com.ibm.db2.jcc.am.BatchUpdateException:
[jcc][t4][102][10040][3.63.75] Batch failure. The batch was
submitted, but at least one exception occurred on an individual member
of the batch. Use getNextException() to retrieve the exceptions for
specific batched elements. ERRORCODE=-4229, SQLSTATE=null at
com.ibm.db2.jcc.am.fd.a(fd.java:407) at
com.ibm.db2.jcc.am.n.a(n.java:386) at
com.ibm.db2.jcc.am.zn.a(zn.java:4897) at
com.ibm.db2.jcc.am.zn.c(zn.java:4528) at
com.ibm.db2.jcc.am.zn.executeBatch(zn.java:2837) at
org.npci.ThreadClass.run(ThreadClass.java:63) at
java.lang.Thread.run(Thread.java:748)
Below is the code, executed with a batch size of 50-100 records:
String queryToUpdate = "UPDATE INST1.ENCRYPT_REF SET ENCR_NUM=? WHERE ID=?";
PreparedStatement pstmtForUpdate = conn.prepareStatement(queryToUpdate);
for (Map.Entry<Long, String> entry : encryptMap.entrySet()) {
    pstmtForUpdate.setString(1, entry.getValue());
    pstmtForUpdate.setLong(2, entry.getKey());
    pstmtForUpdate.addBatch();
}
pstmtForUpdate.executeBatch();
conn.close();

Without knowing anything about your database structure it's hard to recommend a specific solution. If you can change the database, a good strategy would be to partition your table and then arrange for each thread to update a separate partition. Instead of having multiple threads updating one large database and conflicting with each other, you would effectively have each thread updating its own smaller database.
You should also make sure you're batching updates effectively and not committing too often.
If your table has lots of indexes, it might be more efficient to drop some or all of them and rebuild after your update than to maintain them on an ongoing basis. Similarly, you might consider removing triggers, referential integrity constraints, etc., and patching things up afterwards.
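A minimal sketch of the "disjoint work per thread, small batched commits" idea with plain JDBC, assuming each worker is handed its own slice of the id-to-encrypted-value map (so no two threads ever touch the same rows) and a configured DataSource; the worker class and the BATCH_SIZE value are illustrative, not from the question:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;
import javax.sql.DataSource;

// Hypothetical worker: owns a disjoint slice of rows and commits per small batch.
class RangeUpdateWorker implements Runnable {
    private static final int BATCH_SIZE = 100;
    private final DataSource dataSource;   // assumed to be configured elsewhere
    private final Map<Long, String> slice; // id -> encrypted value, owned by this worker only

    RangeUpdateWorker(DataSource dataSource, Map<Long, String> slice) {
        this.dataSource = dataSource;
        this.slice = slice;
    }

    @Override
    public void run() {
        String sql = "UPDATE INST1.ENCRYPT_REF SET ENCR_NUM = ? WHERE ID = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            conn.setAutoCommit(false);
            int pending = 0;
            for (Map.Entry<Long, String> entry : slice.entrySet()) {
                ps.setString(1, entry.getValue());
                ps.setLong(2, entry.getKey());
                ps.addBatch();
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch();
                    conn.commit(); // commit per batch, not once for a million rows
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();
                conn.commit();
            }
        } catch (SQLException e) {
            throw new RuntimeException("Update of slice failed", e);
        }
    }
}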

Not an answer to the question; posted as an answer for better formatting.
To get the actual DB2 SQLCODE, use the following technique. Otherwise it's impossible to understand the root cause of the problem.
try {
    ...
} catch (SQLException ex) {
    while (ex != null) {
        if (ex instanceof com.ibm.db2.jcc.DB2Diagnosable) {
            com.ibm.db2.jcc.DB2Diagnosable db2ex =
                (com.ibm.db2.jcc.DB2Diagnosable) ex;
            com.ibm.db2.jcc.DB2Sqlca sqlca = db2ex.getSqlca();
            if (sqlca != null) {
                System.out.println("SQLCODE: " + sqlca.getSqlCode());
                System.out.println("MESSAGE: " + sqlca.getMessage());
            } else {
                System.out.println("Error code: " + ex.getErrorCode());
                System.out.println("Error msg : " + ex.getMessage());
            }
        } else {
            System.out.println("Error code (no db2): " + ex.getErrorCode());
            System.out.println("Error msg (no db2): " + ex.getMessage());
        }
        ex = ex.getNextException();
    }
    ...
}
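Applied to the batch update above, the same walk works on the BatchUpdateException as well, since it extends SQLException. A sketch, reusing the asker's pstmtForUpdate variable:
try {
    pstmtForUpdate.executeBatch();
} catch (java.sql.BatchUpdateException bue) {
    // per-element results for the failed batch
    System.out.println("Update counts returned: " + bue.getUpdateCounts().length);
    java.sql.SQLException ex = bue; // walk the chain exactly as above
    while (ex != null) {
        if (ex instanceof com.ibm.db2.jcc.DB2Diagnosable) {
            com.ibm.db2.jcc.DB2Sqlca sqlca = ((com.ibm.db2.jcc.DB2Diagnosable) ex).getSqlca();
            if (sqlca != null) {
                System.out.println("SQLCODE: " + sqlca.getSqlCode());
            }
        }
        ex = ex.getNextException();
    }
}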
As for the ENCR_NUM field:
Is it possible to obtain the actual values for this column from outside your application, or can such values be generated only by your application?
Do you have to update all the table rows, or is there some condition on the set of IDs that need to be updated?

Related

Safe data update in mySQL / Java

Here I have a dilemma.
Let's imagine that we have an SQL table like this:
[image of the ticket table]
It could be a problem if two or more users overwrite data in the table.
How should I check that the place hasn't already been taken before updating the data?
I have two options.
In the SQL query:
UPDATE ticket SET user_user_id = ? WHERE place = ? AND user_user_id IS NULL
or in the service layer:
try {
    Ticket ticket = ticketDAO.read(place);
    if (ticket.getUser() == null) {
        ticket.setUser(user);
        ticketDAO.update(ticket);
    } else {
        throw new DAOException("Place has already been taken");
    }
} // catch/finally omitted in the original snippet
Which way is safer and more commonly used in practice?
Please share your advice.
A possible approach here is to go with the SQL query. After executing the query, check the number of rows modified in the ticketDAO.update method; if 0 rows were modified, throw a DAOException.
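A rough JDBC sketch of that check, assuming ticketDAO.update runs the conditional UPDATE shown in the question; the conn, userId and placeId variables are placeholders:
String sql = "UPDATE ticket SET user_user_id = ? WHERE place = ? AND user_user_id IS NULL";
try (PreparedStatement ps = conn.prepareStatement(sql)) {
    ps.setLong(1, userId);
    ps.setLong(2, placeId);
    int updated = ps.executeUpdate();
    if (updated == 0) {
        // another user took the place between reading and updating
        throw new DAOException("Place has already been taken");
    }
}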

DynamoDB wait for table to become active

I am working on a project where we are using DynamoDB as the database.
I used TableUtils from com.amazonaws.services.dynamodbv2.util.TableUtils to create the table if it does not exist.
CreateTableRequest tableRequest = dynamoDBMapper.generateCreateTableRequest(cls);
tableRequest.setProvisionedThroughput(new ProvisionedThroughput(5L, 5L));
boolean created = TableUtils.createTableIfNotExists(amazonDynamoDB, tableRequest);
Now, after creating the table, I have to push the data once the table is active.
I saw there is a method to do this
try {
    TableUtils.waitUntilActive(amazonDynamoDB, cls.getSimpleName());
} catch (Exception e) {
    // TODO: handle exception
}
But this is taking 10 minutes.
Is there a method in TableUtils which returns as soon as the table becomes active?
You may try something as follows.
Table table = dynamoDB.createTable(request);
System.out.println("Waiting for " + tableName + " to be created...this may take a while...");
table.waitForActive();
For more information check out this link.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AppendixSampleDataCodeJava.html
I implemented the solution for this in Go.
Here is the summary.
You have to use the DescribeTable API (or the corresponding call in your SDK).
The input to this API is a DescribeTableInput, where you specify the table name.
You will need to poll in a loop until the table becomes active.
The output of DescribeTable gives you the status of the table (result.Table.TableStatus).
If the status is "ACTIVE" then you can insert the data; otherwise you need to continue the loop.
In my case, the tables become active in less than one minute.
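A rough Java equivalent of that polling loop, using the SDK v1 client already shown in the question; the poll interval and timeout are arbitrary values chosen for illustration:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;

// Sketch: poll DescribeTable until the table reports ACTIVE, then return.
static void waitForActive(AmazonDynamoDB client, String tableName) throws InterruptedException {
    long deadline = System.currentTimeMillis() + 5 * 60 * 1000L; // give up after 5 minutes
    while (System.currentTimeMillis() < deadline) {
        String status = client.describeTable(tableName).getTable().getTableStatus();
        if ("ACTIVE".equals(status)) {
            return; // safe to start writing items
        }
        Thread.sleep(2000L); // poll every 2 seconds
    }
    throw new IllegalStateException("Table " + tableName + " did not become ACTIVE in time");
}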

How to efficiently save value to many duplicated tables?

I have an entity named Message with the fields id (PK), String messageXML, and Timestamp date, and a simple DAO to store the object in an Oracle database (11g) via MyBatis.
The code looks something like this:
Service:
void process(Request request) throws ProcessException {
    Message message = wrapper.getMessage(request);
    Long messageId;
    try {
        messageId = (Long) dao.save(message);
    } catch (DaoException e) {
        throw new ProcessException(e);
    }
}
DAO:
private String mapperName = "messageMapper";

Serializable save(Message message) throws DaoException {
    try {
        getSqlSession().insert(mapperName + ".insert", message);
        return message.getPrimaryKey();
    } catch (Exception e) {
        throw new DaoException(e);
    }
}
Simple code. Unfortunately, the load on this process(req) method is about 500 req/sec, and sometimes I get a lock on the DB while saving a message.
To work around that, I thought about multiplying the Message table: for instance, I would have five tables Message1, Message2 ... Message5, and when saving a Message entity I would pick one of them (drawn at random, round-robin style), for instance:
private Random generator;

public MessageDao() {
    this.generator = new Random();
}

Serializable save(Message message) throws DaoException {
    try {
        getSqlSession().insert(getMapperName() + ".insert", message);
        return message.getPrimaryKey();
    } catch (Exception e) {
        throw new DaoException(e);
    }
}

private String getMapperName() {
    return this.mapperName.concat(String.valueOf(generator.nextInt(5))); // could be more efficient, of course
}
What do you think about this solution? Could it be efficient? How can I make it better? Where could the bottleneck be?
Reading between the lines, I guess you have a number of instances of this code running, serving multiple concurrent requests, hence why you are getting the contention. Or you have one server that is firing 500 requests per second and you experience waits. Not sure which of these you mean. In the former case, you might want to look at extent allocation - if the table/index next extent sizes are small you will see regular latency when Oracle grabs the next extent. Size them too small and you will get this latency very regularly; size them big and when an extent does eventually run out the wait will be longer. You could do something like calculate the storage needed per week, and have a weekly procedure to "grow" the table/indexes accordingly, to avoid this during operating hours. I would be tempted to examine the stats and see what the waits are.
If however the cause is concurrency (maybe in addition to extent management), then you're probably getting hot-block contention on the index used to enforce the PK constraint. Typical strategies to mitigate this include a REVERSE key index (no code change required), or, more controversially, partitioning with a weaker unique constraint by adding a simple column to further segregate the concurrent sessions. E.g. add a column serverId to the table and partition by this and the existing PK column. Assign each application server a unique serverId (config/startup file). Amend the insert to include the serverId. Have one partition per server. Controversial because the constraint is weaker (down to how partitions work), and this will be anathema to purists, but it is something I've used on projects with Oracle Consulting to maximise performance on Exadata. So, it's out there. Of course, partitions can be thought of as distinct tables grouped into a super table, so your idea of writing to separate tables is not a million miles from what is being suggested here. The advantage with partitions is that they are a more natural mechanism for grouping this data, and adding a new partition requires less work than adding a new table when you expand.
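For the reverse key index suggestion, a rough illustration of the DDL, issued here through whatever JDBC connection the application already has; the index name message_pk is a placeholder, not taken from the question:
// Illustration only: rebuild the (placeholder-named) PK index as a reverse key
// index so consecutive key values land in different index blocks.
try (Statement stmt = conn.createStatement()) {
    stmt.execute("ALTER INDEX message_pk REBUILD REVERSE");
}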

SQLite DB is locked exception, how do I unlock it if I haven't ever toyed with it?

java.sql.SQLException: database is locked
at org.sqlite.DB.throwex(DB.java:288)
at org.sqlite.NestedDB.prepare(NestedDB.java:115)
at org.sqlite.DB.prepare(DB.java:114)
at org.sqlite.Stmt.executeQuery(Stmt.java:89)
When I make a query I get this exception. I read up on it on SA and Google, and the most common conclusion is that someone started making another query which never finished. The problem I'm having is that I've never made a query on this DB on this machine before. I downloaded the db file from where I hosted it (I created it earlier) and haven't done anything with it, so I don't know why it would be locked. When I do a query using a program called SQLite Database Browser, it works just fine. Thanks for the help, I'll provide more info if need be, just let me know.
adapter = new DbAdapter();
ResultSet info;
ResultSet attributes;
for (int i = 1; i < 668; i++) {
    if (i % 50 == 0) {
        System.out.print('.');
    }
    info = adapter.makeQuery("SELECT * FROM vehicles WHERE id = '" + i + "'");
    attributes = adapter.makeQuery("SELECT * FROM vehicle_moves WHERE vehicle_id = '" + i + "'");
    if (info.next()) {
        base = new (info, attributes); // constructor class name missing in the original post
    }
    vehicleArray[i] = base;
}
System.out.println("Done.");
info.close();
attributes.close();
adapter.close();
Above is the code where this is occurring. I did some homework throughout my code and sure enough the problem is in this code; other DB queries work just fine. Anything jump out at you guys?
SQLite itself can most certainly handle doing a query while the results of another query are being processed. It'd be terribly useless if that couldn't be done! What's more likely to cause problems is if you've got two connections to the database open at once. I don't know that DbAdapter class at all – not what package it is in, or what module provides it – but if it is assuming that it can open many connections (or if it isn't maintaining proper connection hygiene) then that most certainly would be a cause of the sort of problems you're seeing. Look there first.
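One way to rule that out, assuming DbAdapter wraps a JDBC connection to the SQLite file: hold a single connection for the whole run instead of opening a new one per query. The internals and the vehicles.db file name below are guesses; only the class and method names mirror the question:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DbAdapter implements AutoCloseable {
    private final Connection connection; // one connection shared by every query

    public DbAdapter() throws SQLException {
        this.connection = DriverManager.getConnection("jdbc:sqlite:vehicles.db");
    }

    public ResultSet makeQuery(String sql) throws SQLException {
        Statement stmt = connection.createStatement();
        return stmt.executeQuery(sql); // caller should close the ResultSet and its Statement when done
    }

    @Override
    public void close() throws SQLException {
        connection.close();
    }
}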

Database deletes failed during inserts

I have two Java apps: one of them inserts records into Table1.
The second application reads the first N items and removes them.
When the first application inserts data intensively, the second one fails with a CannotSerializeTransactionException when I try to delete any rows. I don't see the problem: inserted items should be visible to select/delete only once the insert transaction has finished. How can I fix this? Thanks.
TransactionTemplate tt = new TransactionTemplate(platformTransactionManager);
tt.setIsolationLevel(Connection.TRANSACTION_SERIALIZABLE);
tt.execute(new TransactionCallbackWithoutResult() {
    @Override
    protected void doInTransactionWithoutResult(TransactionStatus status) {
        List<Record> records = getRecords(); // jdbc select
        if (!records.isEmpty()) {
            try {
                processRecords(records); // no database
                removeRecords(records);  // jdbc delete - exception here
            } catch (CannotSerializeTransactionException e) {
                log.info("Transaction rollback");
            }
        } else {
            pauseProcessing();
        }
    }
});
pauseProcessing() just sleeps.

public void removeRecords(int changeId) {
    String sql = "delete from RECORDS where ID <= ?";
    getJdbcTemplate().update(sql, new Object[]{changeId});
}
Are you using Connection.TRANSACTION_SERIALIZABLE in the first application as well? It looks like the first application locks the table, so the second one cannot access it (cannot start a transaction). Maybe Connection.TRANSACTION_REPEATABLE_READ would be enough?
You could probably also configure the second application not to throw an exception when it cannot access the resource, but to wait for it instead.
This sounds as if you're reading uncommitted data. Are you sure you're properly setting the isolation level?
It seems to me that you're mixing up constants from two different classes: shouldn't you be passing TransactionDefinition.ISOLATION_SERIALIZABLE instead of Connection.TRANSACTION_SERIALIZABLE to the setIsolationLevel method?
Why do you set the isolation level anyway? Oracle's default isolation level (read committed) is usually the best compromise between consistency and speed and should work nicely in your case.
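A sketch of the suggested change, reusing the question's platformTransactionManager; whether READ_COMMITTED is acceptable depends on the application's consistency needs:
import org.springframework.transaction.TransactionDefinition;
import org.springframework.transaction.support.TransactionTemplate;

TransactionTemplate tt = new TransactionTemplate(platformTransactionManager);
// Use Spring's TransactionDefinition constants rather than java.sql.Connection's,
// and prefer the default READ_COMMITTED unless SERIALIZABLE is really required.
tt.setIsolationLevel(TransactionDefinition.ISOLATION_READ_COMMITTED);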
