Set the database current timestamp while inserting using JOOQ - java

I am using the following code segment to do the insertion using JOOQ's UpdatableRecord.
public void acknowledgeDisclaimer(AcknowledgeDisclaimerReq acknowledgeDisclaimerReq) {
    DisclaimerRecord disclaimerRecord = dslContext.newRecord(Disclaimer.DISCLAIMER);
    disclaimerRecord.setDisclaimerForId(acknowledgeDisclaimerReq.getDealListingId());
    disclaimerRecord.setDisclaimerForType("DEAL");
    disclaimerRecord.setAcceptedAt(LocalDateTime.now());
    disclaimerRecord.setAcceptedByOwnerId(acknowledgeDisclaimerReq.getLoggedInOwnerId());
    int count = disclaimerRecord.store();
    log.info("Inserted entry for disclaimer for deal: {}, owner: {}, id {}, insertCount: {}", disclaimerRecord.getDisclaimerForId(), disclaimerRecord.getAcceptedByOwnerId(), disclaimerRecord.getId(), count);
}
When setting the AcceptedAt data, I want to use the database's current timestamp instead of passing the JVM timestamp. Is there any way to do that in JOOQ?

UpdatableRecord.store() can only set Field<T> => T key/values, not Field<T> => Field<T>, so you cannot set an expression in your record. You can obviously run an explicit INSERT / UPDATE / MERGE statement instead.
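For example, a minimal sketch of such an explicit insert with jOOQ, letting the database compute the timestamp (the generated column constants below are assumptions derived from your record's setters):
dslContext.insertInto(Disclaimer.DISCLAIMER)
    .set(Disclaimer.DISCLAIMER.DISCLAIMER_FOR_ID, acknowledgeDisclaimerReq.getDealListingId())
    .set(Disclaimer.DISCLAIMER.DISCLAIMER_FOR_TYPE, "DEAL")
    // Database-side timestamp instead of LocalDateTime.now() from the JVM
    .set(Disclaimer.DISCLAIMER.ACCEPTED_AT, org.jooq.impl.DSL.currentLocalDateTime())
    .set(Disclaimer.DISCLAIMER.ACCEPTED_BY_OWNER_ID, acknowledgeDisclaimerReq.getLoggedInOwnerId())
    .execute();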
Using triggers
The best way to ensure such a timestamp is set to the database timestamp whenever you run some specific DML on the table is to use a database trigger (you could make the trigger watch for changes in the ACCEPTED_BY_OWNER_ID value).
If you can't do this on the server side (which is the most reliable, because it will behave correctly for all database clients, not just the JDBC/jOOQ based ones), you might have a few client side options in jOOQ:
Using jOOQ 3.17 client side computed columns
jOOQ 3.17 has added support for stored (or virtual) client side computed columns, a special case of which are audit columns (which is almost what you're doing).
Using this, you can specify, for example:
<forcedType>
  <generator><![CDATA[
    ctx -> org.jooq.impl.DSL.currentTimestamp()
  ]]></generator>
  <includeExpression>(?i:ACCEPTED_AT)</includeExpression>
</forcedType>
The above acts like a trigger that sets the ACCEPTED_AT date to the current timestamp every time you write to the table. In your case, it'll be more like:
<forcedType>
  <generator><![CDATA[
    ctx -> org.jooq.impl.DSL
        .when(ACCEPTED_BY_OWNER_ID.isNotNull(), org.jooq.impl.DSL.currentTimestamp())
        .else_(ctx.table().ACCEPTED_AT)
  ]]></generator>
  <includeExpression>(?i:ACCEPTED_AT)</includeExpression>
</forcedType>
See a current limitation of the above here:
https://github.com/jOOQ/jOOQ/issues/13809
See the relevant manual sections here:
Client side computed columns
Audit columns

Should be something like disclaimerRecord.setAcceptedAt(DSL.now());

Related

Spark writing to Cassandra with varying TTL

In Java Spark, I have a dataframe that has a 'bucket_timestamp' column, which represents the time of the bucket that the row belongs to.
I want to write the dataframe to a Cassandra DB. The data must be written to the DB with a TTL. The TTL should depend on the bucket timestamp: each row's TTL should be calculated as ROW_TTL = CONST_TTL - (CurrentTime - bucket_timestamp), where CONST_TTL is a constant TTL that I configured.
Currently I am writing to Cassandra with spark using a constant TTL, with the following code:
df.write().format("org.apache.spark.sql.cassandra")
    .options(new HashMap<String, String>() {
        {
            put("keyspace", "key_space_name");
            put("table", "table_name");
            put("spark.cassandra.output.ttl", Long.toString(CONST_TTL)); // Should depend on the bucket_timestamp column
        }
    }).mode(SaveMode.Overwrite).save();
One possible way I thought about is: for each possible bucket_timestamp, filter the data by that timestamp, calculate the TTL and write the filtered data to Cassandra. But this seems very inefficient and not the Spark way. Is there a way in Java Spark to provide a Spark column as the TTL option, so that the TTL will differ for each row?
The solution should work with Java and Dataset<Row>: I came across some solutions for doing this with RDDs in Scala, but didn't find a solution for Java and DataFrames.
Thanks!
From Spark-Cassandra connector options (https://github.com/datastax/spark-cassandra-connector/blob/v2.3.0/spark-cassandra-connector/src/main/java/com/datastax/spark/connector/japi/RDDAndDStreamCommonJavaFunctions.java) you can set the TTL as:
constant value (withConstantTTL)
automatically resolved value (withAutoTTL)
column-based value (withPerRowTTL)
In your case you could try the last option and compute the TTL as a new column of the starting Dataset with the rule you provided in the question.
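For illustration, a rough sketch of computing such a column with the Dataset API (this assumes bucket_timestamp holds epoch seconds; adjust the arithmetic if it is a timestamp type):
import static org.apache.spark.sql.functions.*;

// Per-row rule from the question: ROW_TTL = CONST_TTL - (CurrentTime - bucket_timestamp)
Dataset<Row> withTtl = df.withColumn(
        "ttl",
        lit(CONST_TTL).minus(unix_timestamp().minus(col("bucket_timestamp"))));
The resulting "ttl" column is what the per-row TTL option would then reference when writing.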
For a usage example you can see the test here: https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/it/scala/com/datastax/spark/connector/writer/TableWriterSpec.scala#L612
For the DataFrame API there is no support for such functionality yet... There is a JIRA for it - https://datastax-oss.atlassian.net/browse/SPARKC-416 - you can watch it to get notified when it's implemented...
So the only choice that you have is to use the RDD API as described in #bartosz25's answer...

No response with a query by ID on Azure DocumentDB

I'm currently facing very slow / no response on a collection when looking up by ID. I have ~2 million documents in a partitioned collection. If I look up a document using the partitionKey and id, the response is immediate:
SELECT * FROM c WHERE c.partitionKey=123 AND c.id="20566-2"
If I try using only the id:
SELECT * FROM c WHERE c.id="20566-2"
the response never returns; the Java client seems frozen and I have the same situation using the Data Explorer from the Azure Portal. I also tried looking up by another field that isn't the id or the partitionKey, and the response always returns. When I run the SELECT from the Java client I always set the flag to enable cross-partition queries.
The next thing to try is to avoid the character "-" in the ID, to test whether this character blocks the query (although I didn't find anything about it in the documentation).
The issue is related to your Java code. The Azure DocumentDB Java SDK wraps the DocumentDB REST APIs; according to the reference for the REST API Query Documents, as #DanCiborowski-MSFT said, the header x-ms-documentdb-query-enablecrosspartition explains the reason for your issue, as below.
Header: x-ms-documentdb-query-enablecrosspartition
Required/Type: Optional/Boolean
Description: If the collection is partitioned, this must be set to True to allow execution across multiple partitions. Queries that filter against a single partition key, or against single-partitioned collections do not need to set the header.
So you need to set it to True to enable querying across multiple partitions without a partitionKey in the WHERE clause, by passing an instance of the FeedOptions class to the queryDocuments method, as below.
FeedOptions queryOptions = new FeedOptions();
queryOptions.setEnableCrossPartitionQuery(true); // Enable query across multiple partitions
String collectionLink = collection.getSelfLink();
FeedResponse<Document> queryResults = documentClient.queryDocuments(
        collectionLink,
        "SELECT * FROM c WHERE c.id='20566-2'", queryOptions);

Flyway partial migration of legacy application

We have an application with a custom database migrator which we want to replace with Flyway.
These migrations are split into some categories like "account" for user management and "catalog" for the product catalog.
Files are named $category.migration.$version.sql. Here, $category is one of the above categories and $version is an integer version starting from 0.
e.g. account.migration.23.sql
Although one could argue that each category should be a separate database, in fact it isn't and a major refactoring would be required to change that.
Also I could use one schema per category, but again this would require rewriting all SQL queries.
So I did the following:
Move $category.migration.$version.sql to /sql/$category/V$version__$category.sql (e.g. account.migration.1.sql becomes /sql/account/V1__account.sql)
Use a metadata table per category
Set the baseline version to zero
In code that would be
String[] _categories = new String[] { "catalog", "account" };
for (String _category : _categories) {
    Flyway _flyway = new Flyway();
    _flyway.setDataSource(databaseUrl.getUrl(), databaseUrl.getUser(), databaseUrl.getPassword());
    _flyway.setBaselineVersion(MigrationVersion.fromVersion("0"));
    _flyway.setLocations("classpath:/sql/" + _category);
    _flyway.setTarget(MigrationVersion.fromVersion(_version + ""));
    _flyway.setTable(_category + "_schema_version");
    _flyway.setBaselineOnMigrate(true); // (1)
    _flyway.migrate();
}
So there would be the metadata tables catalog_schema_version and account_schema_version.
Now the issue is as follows:
Starting with an empty database I would like to apply all pre-existing migrations per category, as done above.
If I remove _flyway.setBaselineOnMigrate(true); (1), then the catalog migration (the first one) succeeds, but it would complain for account that the schema public is not empty.
Likewise setting _flyway.setBaselineOnMigrate(true); causes the following behavior:
The migration of "catalog" succeeds but V0_account.sql is ignored and Flyway starts with V1_account.sql, maybe because it somehow still thinks the database was already baselined?
Does anyone have a a suggestion for resolving the problem?
Your easiest solution is to keep the schema_version tables each in a schema of their own. I've answered a very similar question here.
Regarding your observation on baseline, those are expected traits. The migration of account starts at V1 because, with the combination of baseline=0, baselineOnMigrate=true and a non-empty target schema (because catalog has already populated it), Flyway determines this is a pre-existing database that is equal to the baseline - thus it starts at V1.
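As a rough sketch of that idea against your loop (the schema naming here is illustrative, and the linked answer covers the caveats), each category's metadata table gets its own schema:
// Sketch: host each category's metadata table in its own schema, so the
// "schema public is not empty" check and the baseline apply per category
_flyway.setSchemas("flyway_" + _category);        // first schema holds the metadata table
_flyway.setTable(_category + "_schema_version");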

Proper way to insert record with unique attribute

I am using Spring, Hibernate and PostgreSQL.
Let's say I have a table looking like this:
CREATE TABLE test
(
    id integer NOT NULL,
    name character(10),
    CONSTRAINT test_unique UNIQUE (id)
)
So whenever I insert a record, the id attribute should be unique.
I would like to know which is the better way to insert a new record (in my Spring Java app):
1) Check if a record with the given id exists and, if it doesn't, insert the record, something like this:
if (testDao.find(id) == null) {
    Test test = new Test(id, name);
    testDao.create(test);
}
2) Call the create method directly and see whether it throws a DataAccessException...
Test test = new Test(id, name);
try {
    testDao.create(test);
} catch (DataAccessException e) {
    System.out.println("Error inserting record");
}
I consider the 1st way appropriate, but it means more processing for the DB. What is your opinion?
Thank you in advance for any advice.
Option (1) is subject to a race condition, where a concurrent session could create the record between checking for it and inserting it. This window is longer than you might expect, because the record might already have been inserted by another transaction, but not yet committed.
Option (2) is better, but will result in a lot of noise in the PostgreSQL error logs.
The best way is to use PostgreSQL 9.5's INSERT ... ON CONFLICT ... support to do a reliable, race-condition-free insert-if-not-exists operation.
On older versions you can use a loop in plpgsql.
Both those options require use of native queries, of course.
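A minimal sketch of what such a native query could look like with a Hibernate Session (assuming a SessionFactory is available; the table and columns follow the question's DDL):
// Race-free insert-if-not-exists on PostgreSQL 9.5+ via a native SQL query
sessionFactory.getCurrentSession()
    .createSQLQuery("INSERT INTO test (id, name) VALUES (:id, :name) ON CONFLICT (id) DO NOTHING")
    .setParameter("id", id)
    .setParameter("name", name)
    .executeUpdate();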
It depends on the source of your ID. If you generate it yourself you can assert uniqueness and rely on catching an exception, e.g. with a UUID: http://docs.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html
Another way would be to let Postgres generate the ID using the SERIAL data type
http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-SERIAL
If you have to take the ID from an untrusted source, do the prior check.
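A minimal sketch of the database-generated-ID approach with a standard JPA mapping (the entity mirrors the table above; names are illustrative):
import javax.persistence.*;

@Entity
public class Test {
    // Let the database assign the id (e.g. a PostgreSQL SERIAL/identity column),
    // so uniqueness never has to be checked in application code
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;

    @Column(length = 10)
    private String name;
}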

Database timestamps not matching

I have an action in Struts 2 that will query the database for an object and then copy it with a few changes. Then, it needs to retrieve the new objectID from the copy and create a file called objectID.txt.
Here is the relevant code:
Action Class:
ObjectVO objectVOcopy = objectService.searchObjects(objectId);
//Set the ID to 0 so a new row is added, instead of the current one being updated
objectVOcopy.setObjectId(0);
Date today = new Date();
Timestamp currentTime = new Timestamp(today.getTime());
objectVOcopy.setTimeStamp(currentTime);
//Add copy to database
objectService.addObject(objectVOcopy);
//Get the copy object's ID from the database
int newObjectId = objectService.findObjectId(currentTime);
File inboxFile = new File(parentDirectory.getParent()+"\\folder1\\folder2\\"+newObjectId+".txt");
ObjectDAO
//Retrieve identifying ID of copy object from database
List<ObjectVO> object = getHibernateTemplate().find("from ObjectVO where timeStamp = ?", currentTime);
return object.get(0).getObjectId();
The problem is that more often than not, the ObjectDAO search method will not return anything. When debugging I've noticed that the Timestamp currentTime passed to it is usually about 1-2 ms off the value in the database. I have worked around this bug by changing the Hibernate query to search for objects with a timestamp within 3 ms of the one passed, but I'm not sure where this discrepancy is coming from. I'm not recalculating the currentTime; I'm using the same one to retrieve from the database as I am to write to the database. I'm also worried that when I deploy this to another server the discrepancy might be greater. Other than the objectID, this is the only unique identifier, so I need to use it to get the copy object.
Does anyone know why this is occurring, and is there a better workaround than just searching through a range? I'm using Microsoft SQL Server 2008 R2, btw.
Thanks.
The precision of SQL Server's DATETIME data type does not match what you can generate in other languages. SQL Server rounds DATETIME values to increments of .000, .003 or .007 seconds - this is why you can say:
DECLARE @d DATETIME = '20120821 23:59:59.997';
SELECT @d;
Result:
2012-08-21 23:59:59.997
Then try:
DECLARE @d DATETIME = '20120821 23:59:59.999';
SELECT @d;
Result:
2012-08-22 00:00:00.000
Since you are using SQL Server 2008 R2, you should make sure to use the DATETIME2 data type instead of DATETIME.
That said, #RedFilter makes a good point - why are you relying on the time stamp when you can use the generated ID instead?
This feels wrong.
Other than the objectID, this is the only unique identifier
Databases have the concept of a unique identifier for a reason. You should really use that to retrieve an instance of your object.
You can use the get method on the Hibernate session and take advantage of the session and second level caches as well.
With your approach, you execute a query every time you retrieve your object.
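For example, a small sketch of that idea with the question's HibernateTemplate (assuming ObjectVO maps a database-generated identifier):
// save() returns the generated identifier, so there is no need to
// look the copy up again by timestamp afterwards
Serializable newObjectId = getHibernateTemplate().save(objectVOcopy);
ObjectVO copy = getHibernateTemplate().get(ObjectVO.class, newObjectId);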
