I am working on a project where we are using DynamoDB as the database.
I used TableUtils (import com.amazonaws.services.dynamodbv2.util.TableUtils;)
to create the table if it does not exist:
CreateTableRequest tableRequest = dynamoDBMapper.generateCreateTableRequest(cls);
tableRequest.setProvisionedThroughput(new ProvisionedThroughput(5L, 5L));
boolean created = TableUtils.createTableIfNotExists(amazonDynamoDB, tableRequest);
Now, after creating the table, I have to push data once the table is active.
I saw there is a method to do this:
try {
    TableUtils.waitUntilActive(amazonDynamoDB, cls.getSimpleName());
} catch (Exception e) {
    // TODO: handle exception
}
But this takes 10 minutes.
Is there a method in TableUtils that returns as soon as the table becomes active?
You may try something like the following:
Table table = dynamoDB.createTable(request);
System.out.println("Waiting for " + tableName + " to be created...this may take a while...");
table.waitForActive();
For more information, check out this link:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AppendixSampleDataCodeJava.html
I implemented a solution for this in Go. Here is the summary:
Use the DescribeTable API (or the corresponding call in your SDK). Its input is a DescribeTableInput, where you specify the table name.
You will need to poll in a loop until the table becomes active. The DescribeTable output gives you the status of the table (result.Table.TableStatus).
If the status is "ACTIVE", you can insert the data; otherwise, continue the loop.
In my case, the tables are becoming active in less than one minute.
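That polling loop can also be sketched in Java. In the sketch below, the status supplier is a stand-in for the DescribeTable call (something like amazonDynamoDB.describeTable(request).getTable().getTableStatus() in the v1 SDK), and the fake supplier in main merely simulates a table that becomes ACTIVE on the third poll, so the loop runs without AWS. TablePoller and waitUntilActive are hypothetical names, not part of any SDK.

```java
import java.util.function.Supplier;

public class TablePoller {
    // Polls until the supplier reports "ACTIVE" or the attempts run out.
    // In real code, statusFn would wrap a DescribeTable call.
    static boolean waitUntilActive(Supplier<String> statusFn, int maxAttempts, long sleepMs)
            throws InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            if ("ACTIVE".equals(statusFn.get())) {
                return true;
            }
            Thread.sleep(sleepMs); // back off between DescribeTable calls
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated status sequence: CREATING, CREATING, then ACTIVE.
        int[] calls = {0};
        Supplier<String> fake = () -> ++calls[0] < 3 ? "CREATING" : "ACTIVE";
        System.out.println(waitUntilActive(fake, 10, 10)); // prints true
    }
}
```

maxAttempts and sleepMs bound the total wait, so the loop returns as soon as the table is active instead of blocking for a fixed period.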
Here I have a dilemma.
Let's imagine we have an SQL table like this:
[image of the ticket table, with columns including place and user_user_id]
It could be a problem if two or more users overwrite data in the table.
How should I check that the place hasn't already been taken before updating the data?
I have two options:
in the SQL query:
UPDATE ticket SET user_user_id = ? WHERE place = ? AND user_user_id is NULL
or in the service layer:
Ticket ticket = ticketDAO.read(place);
if (ticket.getUser() == null) {
    ticket.setUser(user);
    ticketDAO.update(ticket);
} else {
    throw new DAOException("Place has already been taken");
}
Which way is safer and more commonly used in practice?
Please share your advice.
A possible approach here is to go with the SQL query. After executing it, check the number of rows modified in the ticketDAO.update method; if 0 rows were modified, throw a DAOException.
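A sketch of that check-the-count pattern: in the real DAO the count would come from PreparedStatement.executeUpdate() on the UPDATE above; the map-backed bookIfFree below is only a stand-in so the logic is runnable without a database, and TicketDao, bookIfFree and book are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TicketDao {
    // place -> user id; an absent key plays the role of user_user_id IS NULL
    private final Map<Integer, Integer> seatOwner = new ConcurrentHashMap<>();

    // Stands in for: UPDATE ticket SET user_user_id = ? WHERE place = ? AND user_user_id IS NULL
    // Returns the number of "rows" modified (0 or 1), like executeUpdate() would.
    int bookIfFree(int place, int userId) {
        return seatOwner.putIfAbsent(place, userId) == null ? 1 : 0;
    }

    void book(int place, int userId) {
        if (bookIfFree(place, userId) == 0) {
            // would be DAOException in the real application
            throw new IllegalStateException("Place has already been taken");
        }
    }

    public static void main(String[] args) {
        TicketDao dao = new TicketDao();
        System.out.println(dao.bookIfFree(7, 1)); // prints 1: seat was free
        System.out.println(dao.bookIfFree(7, 2)); // prints 0: already taken
    }
}
```

The point of the pattern is that the conditional UPDATE and the existence check happen atomically in the database, so no separate read-then-write race is possible.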
I have a requirement to update more than 1,000,000 records in a DB2 database.
I tried using Hibernate with a multi-threaded application updating the records. However, I was getting a LockAcquisitionException, which I suspect is caused by the bulk commits combined with multiple threads.
Can someone recommend a better solution or a better way to do this?
Please let me know if I need to upload the code I am using.
Thanks in advance.
// Code run multiple times, from several threads
Transaction tx = session.beginTransaction();
for (EncryptRef abc : arList) {
    String encrypted = keyUtils.encrypt(abc.getNumber()); // encrypt some data
    Object o = session.load(EncryptRef.class, new Long(abc.getId())); // load by primary key
    EncryptRef object = (EncryptRef) o;
    object.setEncryptedNumber(encrypted); // update the row
}
tx.commit(); // bulk-commit the updates
The table contains just three columns: ID | PlainText | EncryptedText
Update:
I tried batch updates using JDBC prepared statements. However, I am still facing the exception below:
com.ibm.db2.jcc.am.BatchUpdateException:
[jcc][t4][102][10040][3.63.75] Batch failure. The batch was
submitted, but at least one exception occurred on an individual member
of the batch. Use getNextException() to retrieve the exceptions for
specific batched elements. ERRORCODE=-4229, SQLSTATE=null at
com.ibm.db2.jcc.am.fd.a(fd.java:407) at
com.ibm.db2.jcc.am.n.a(n.java:386) at
com.ibm.db2.jcc.am.zn.a(zn.java:4897) at
com.ibm.db2.jcc.am.zn.c(zn.java:4528) at
com.ibm.db2.jcc.am.zn.executeBatch(zn.java:2837) at
org.npci.ThreadClass.run(ThreadClass.java:63) at
java.lang.Thread.run(Thread.java:748)
Below is the code, executed with a batch size of 50-100 records:
String queryToUpdate = "UPDATE INST1.ENCRYPT_REF SET ENCR_NUM=? WHERE ID=?";
PreparedStatement pstmtForUpdate = conn.prepareStatement(queryToUpdate);
for (Map.Entry<Long, String> entry : encryptMap.entrySet()) {
    pstmtForUpdate.setString(1, entry.getValue());
    pstmtForUpdate.setLong(2, entry.getKey());
    pstmtForUpdate.addBatch();
}
pstmtForUpdate.executeBatch();
conn.close();
Without knowing anything about your database structure it's hard to recommend a specific solution. If you can change the database, a good strategy would be to partition your table and arrange for each thread to update a separate partition. Instead of having multiple threads updating one large table and conflicting with each other, each thread would effectively be updating its own smaller table.
You should also make sure you're batching updates effectively and not committing too often.
If your table has many indexes, it might be more efficient to drop some or all of them and rebuild them after your update than to maintain them on an ongoing basis. Similarly, you might consider removing triggers, referential-integrity constraints, etc., and patching things up afterwards.
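The partition-per-thread idea can be sketched as follows. The chunking and the thread pool are real; the body of each task just prints its chunk, standing in for a per-chunk JDBC batch update, since in the real application each worker would open its own connection and transaction. PartitionedUpdate and partition are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PartitionedUpdate {
    // Splits ids into nChunks contiguous ranges so each worker touches a
    // disjoint set of rows, avoiding lock conflicts between threads.
    static List<List<Long>> partition(List<Long> ids, int nChunks) {
        List<List<Long>> chunks = new ArrayList<>();
        int size = (ids.size() + nChunks - 1) / nChunks; // ceiling division
        for (int i = 0; i < ids.size(); i += size) {
            chunks.add(ids.subList(i, Math.min(i + size, ids.size())));
        }
        return chunks;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 10; i++) ids.add(i);

        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (List<Long> chunk : partition(ids, 3)) {
            // In the real application this task would run one batched
            // UPDATE over its chunk of IDs inside its own transaction.
            pool.submit(() -> System.out.println(Thread.currentThread().getName() + " -> " + chunk));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Contiguous ID ranges matter here: if threads interleave over the same pages or index ranges, lock contention comes back even with separate transactions.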
Not an answer to the question; posted for better formatting.
To catch the actual DB2 SQLCODE, use the following technique. Otherwise it is impossible to understand the root cause of the problem.
try {
    ...
} catch (SQLException ex) {
    while (ex != null) {
        if (ex instanceof com.ibm.db2.jcc.DB2Diagnosable) {
            com.ibm.db2.jcc.DB2Diagnosable db2ex = (com.ibm.db2.jcc.DB2Diagnosable) ex;
            com.ibm.db2.jcc.DB2Sqlca sqlca = db2ex.getSqlca();
            if (sqlca != null) {
                System.out.println("SQLCODE: " + sqlca.getSqlCode());
                System.out.println("MESSAGE: " + sqlca.getMessage());
            } else {
                System.out.println("Error code: " + ex.getErrorCode());
                System.out.println("Error msg : " + ex.getMessage());
            }
        } else {
            System.out.println("Error code (no db2): " + ex.getErrorCode());
            System.out.println("Error msg (no db2): " + ex.getMessage());
        }
        ex = ex.getNextException();
    }
    ...
}
As for the ENCR_NUM field:
Is it possible for actual values of this column to come from outside your application, or can such values be generated only by your application?
Do you have to update all the table rows, or is there some condition on the set of IDs that need to be updated?
Can anyone help me with how to create a function in Java that scans a MySQL table every 5 seconds to detect newly inserted data?
If you need a Java-only solution, you can do it with the Timer and TimerTask classes provided by Java.
Here is the code:
java.util.TimerTask task = new java.util.TimerTask() {
    int prevCount = 0;
    @Override
    public void run() {
        try (Connection conn = getConnection();
             ResultSet rs = conn.prepareStatement("SELECT COUNT(*) FROM table").executeQuery()) {
            if (rs.next()) {
                int count = rs.getInt(1);
                System.out.println("New rows since last check: " + (count - prevCount));
                prevCount = count;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
};
java.util.Timer timer = new java.util.Timer(true);// true to run timer as daemon thread
timer.schedule(task, 0, 5000); // run the task every 5 seconds
try {
Thread.sleep(60000); // Cancel task after 1 minute.
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
timer.cancel();
Searching big tables is a heavy operation, so you can probably reduce the number of table reads by detecting new data in another way.
For example, you can check the table's size before the actual data fetch, either by performing a "SELECT COUNT(*) FROM table" query or even by calculating the table's size on disk as described here: How to get the sizes of the tables of a mysql database?
A variant with a database trigger can also help. For example, the trigger could update some marker of the last table update, which your Java app then watches. That variant also avoids performing idle reads of your table.
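The count-before-fetch idea can be sketched like this. The IntSupplier stands in for the "SELECT COUNT(*) FROM table" query so the logic is runnable without MySQL; ChangeDetector and hasNewData are hypothetical names.

```java
import java.util.function.IntSupplier;

public class ChangeDetector {
    private int lastCount = -1; // -1 so the first observation always reports a change

    // countFn stands in for "SELECT COUNT(*) FROM table".
    // Returns true only when the count changed since the last check,
    // letting the caller skip the expensive full fetch otherwise.
    // Caveat: a delete plus an insert between checks leaves the count
    // unchanged and goes unnoticed, which is why a last-update marker
    // maintained by a trigger is more robust.
    boolean hasNewData(IntSupplier countFn) {
        int count = countFn.getAsInt();
        boolean changed = count != lastCount;
        lastCount = count;
        return changed;
    }

    public static void main(String[] args) {
        ChangeDetector d = new ChangeDetector();
        System.out.println(d.hasNewData(() -> 5)); // prints true: first observation
        System.out.println(d.hasNewData(() -> 5)); // prints false: nothing new
        System.out.println(d.hasNewData(() -> 7)); // prints true: two rows arrived
    }
}
```

A run of the TimerTask from the earlier answer could call hasNewData first and only execute the full SELECT when it returns true.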
You don't need a Java program to scan for changes in the DB; apart from being redundant, it is costly for the network and the DB, and it's not good practice to implement a feature for which a standard solution already exists.
CREATE TRIGGER `some_update_happened` BEFORE/AFTER INSERT/UPDATE/DELETE
ON `mydb`.`mytable`
FOR EACH ROW BEGIN
    -- your code here for the db trigger calling the java function
END;
What you can do is use DB update triggers; refer to this link & this link.
After implementing the trigger, I suppose you need to catch it using a service implemented in PHP, Java, etc. You need to implement an event listener to receive the trigger notification, just like done here for PHP; there is also an example for this in the Oracle docs, but that is for Oracle. Here is an example in Java + MySQL.
I understand you are a beginner; just go step by step, error by error, and you will get there. Good luck.
I am using Spring, Hibernate and PostgreSQL.
Let's say I have a table looking like this:
CREATE TABLE test
(
    id integer NOT NULL,
    name character(10),
    CONSTRAINT test_unique UNIQUE (id)
);
So whenever I insert a record, the id attribute must be unique.
I would like to know the better way to insert a new record (in my Spring Java app):
1) Check whether a record with the given id exists, and insert only if it doesn't, something like this:
if (testDao.find(id) == null) {
    Test test = new Test(id, name);
    testDao.create(test);
}
2) Call the create method straight away and catch the DataAccessException it may throw:
Test test = new Test(id, name);
try {
    testDao.create(test);
} catch (DataAccessException e) {
    System.out.println("Error inserting record");
}
I consider the 1st way the appropriate one, but it means more processing for the DB. What is your opinion?
Thank you in advance for any advice.
Option (1) is subject to a race condition: a concurrent session could create the record between your checking for it and inserting it. This window is longer than you might expect, because the record might already have been inserted by another transaction that has not yet committed.
Option (2) is better, but will result in a lot of noise in the PostgreSQL error logs.
The best way is to use PostgreSQL 9.5's INSERT ... ON CONFLICT ... support to do a reliable, race-condition-free insert-if-not-exists operation.
On older versions you can use a loop in PL/pgSQL.
Both those options require use of native queries, of course.
It depends on the source of your ID. If you generate it yourself, you can guarantee uniqueness and rely on catching the exception, e.g. by using a UUID: http://docs.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html
Another way would be to let Postgres generate the ID using the SERIAL data type:
http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-SERIAL
If you have to take the ID from an untrusted source, do the prior check.
I am working on a system in which I need to store Avro schemas in a Cassandra database. So in Cassandra we will be storing something like this:
SchemaId | AvroSchema
1        | some schema
2        | another schema
Now suppose I insert another row into the above table, so that it becomes:
SchemaId | AvroSchema
1        | some schema
2        | another schema
3        | another new schema
As soon as a new row is inserted into the above table, I need my Java program to go and pull the new schema id and the corresponding schema.
What is the right way to solve this kind of problem?
I know one way is to poll every few minutes, say every 5 minutes, and pull the data from the above table, but that doesn't feel right: I would be doing a pull every 5 minutes whether or not there are any new schemas.
But is there any other solution apart from this?
Can we use Apache Zookeeper? Or Zookeeper is not fit for this problem?
Or any other solution?
I am running Apache Cassandra 1.2.9
Some solutions:
With database triggers: Cassandra 2.0 has some trigger support, but it is not final and might change a little in 2.1, according to this article: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support. Triggers are a common solution.
You brought up polling, but that is not always a bad option, especially if you have something that marks a row as not yet pulled, so you can pull only the new rows out of Cassandra. A pull once every 5 minutes is nothing, load-wise, for Cassandra or any database, as long as the query is not costly. This option might not be good if new rows are inserted very infrequently.
Zookeeper would not be a perfect solution, see this quote:
Because watches are one time triggers and there is latency between
getting the event and sending a new request to get a watch you cannot
reliably see every change that happens to a node in ZooKeeper. Be
prepared to handle the case where the znode changes multiple times
between getting the event and setting the watch again. (You may not
care, but at least realize it may happen.)
Quote sourced from: http://zookeeper.apache.org/doc/r3.4.2/zookeeperProgrammers.html#sc_WatchRememberThese
Cassandra 3.0
You can use the trigger below; it will give you everything in the insert as a JSON object.
// Imports assume Cassandra 3.0 internals plus SLF4J and org.json on the classpath.
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;

import org.apache.cassandra.config.ColumnDefinition;
import org.apache.cassandra.db.Clustering;
import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.db.partitions.Partition;
import org.apache.cassandra.db.rows.Cell;
import org.apache.cassandra.db.rows.Unfiltered;
import org.apache.cassandra.db.rows.UnfilteredRowIterator;
import org.apache.cassandra.triggers.ITrigger;
import org.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HelloWorld implements ITrigger
{
    private static final Logger logger = LoggerFactory.getLogger(HelloWorld.class);

    public Collection<Mutation> augment(Partition partition)
    {
        String tableName = partition.metadata().cfName;
        logger.info("Table: " + tableName);
        JSONObject obj = new JSONObject();
        obj.put("message_id", partition.metadata().getKeyValidator().getString(partition.partitionKey().getKey()));
        try {
            UnfilteredRowIterator it = partition.unfilteredIterator();
            while (it.hasNext()) {
                Unfiltered un = it.next();
                Clustering clt = (Clustering) un.clustering();
                Iterator<Cell> cells = partition.getRow(clt).cells().iterator();
                Iterator<ColumnDefinition> columns = partition.getRow(clt).columns().iterator();
                while (columns.hasNext()) {
                    ColumnDefinition columnDef = columns.next();
                    Cell cell = cells.next();
                    String data = new String(cell.value().array()); // if the cell type is text
                    obj.put(columnDef.toString(), data);
                }
            }
        } catch (Exception e) {
            logger.error("Failed to read partition", e); // do not swallow errors silently
        }
        logger.debug(obj.toString());
        return Collections.emptyList();
    }
}