How much can I trust OracleDB's ROWID in a long run? - java

I am working on a small POC using Spring Boot and OracleDB.
The situation is:
At application startup, I load a few properties (some data) from the DB into a cache. There will be frequent requests that need this data, hence I decided to cache it. The data in the DB will rarely change; only once in a while someone may insert/delete/update a couple of rows using an SQL script. For when it does change, I have implemented Oracle's Database Change Notification to notify the Spring Boot service that some data has changed and the data in the cache is now stale.
In the notification event, I only get the ROWID pseudocolumn, which can be used to identify which portion of the data in the DB differs from the cache that I have. To be on the safer side, I have decided to cache the ROWIDs to map the data in the cache to the data object in the notification event that the DB sends me. After working with this for a couple of days, I have found that the ROWID doesn't change, but how much should I trust this non-changing behavior of ROWIDs in the long run or in a production environment?
A few scenarios explained for clarification:
The cache will reload itself every time the server restarts. Therefore, data changing while the server is down is out of the picture.
So far (up to the POC), I am getting a notification for every insert/update/delete made in the DB using an SQL query/script.
Example of event.toString() for reference:
Connection information : local=view-localhost/127.0.0.1:47632, remote=view-localhost/127.0.0.1:57117
Registration ID : 1201
Notification version : 1
Event type : QUERYCHANGE
Database name : orcl
Query Change Description (length=1)
query ID=41, query change event type=QUERYCHANGE
Table Change Description (length=1): operation=[INSERT], tableName=SYSTEM.PRODUCT, objectNumber=73323
Row Change Description (length=1):
ROW: operation=INSERT, ROWID=AAAR5rAABAAAbHZAAA

This assumes your table does not have row movement enabled (check the ROW_MOVEMENT column in dba_tables).
You need to be careful with deletes followed by inserts: these will logically give a row a new ROWID (it's a completely new row, after all).
You will also need to be aware of table moves; this is an intensive operation that requires indexes to be rebuilt anyway, so it is unlikely to happen without plenty of notice.
Otherwise, a row will keep its ROWID.
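To illustrate the delete-then-insert case, here is a minimal sketch of a cache keyed by ROWID that applies each row change as it arrives. The class name RowIdCache, the Op enum and the loader function are all hypothetical and not part of the Oracle JDBC API; the loader stands in for something like a SELECT ... WHERE ROWID = ? lookup. It does not try to handle table moves, where a full reload would be the conservative response.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache keyed by ROWID; names are illustrative, not Oracle JDBC API.
final class RowIdCache<V> {
    enum Op { INSERT, UPDATE, DELETE }

    private final Map<String, V> byRowId = new ConcurrentHashMap<>();
    // Assumed to load the current row for a ROWID; empty if the row no longer exists.
    private final Function<String, Optional<V>> loader;

    RowIdCache(Function<String, Optional<V>> loader) {
        this.loader = loader;
    }

    // Apply one row change taken from a notification event.
    void onRowChange(Op op, String rowId) {
        if (op == Op.DELETE) {
            byRowId.remove(rowId);
            return;
        }
        Optional<V> fresh = loader.apply(rowId);
        if (fresh.isPresent()) {
            byRowId.put(rowId, fresh.get());
        } else {
            // The row vanished between the event and the reload; drop the stale entry.
            byRowId.remove(rowId);
        }
    }

    Optional<V> get(String rowId) {
        return Optional.ofNullable(byRowId.get(rowId));
    }

    int size() {
        return byRowId.size();
    }
}
```

With this shape, a delete followed by an insert simply removes the entry under the old ROWID and adds one under the new ROWID, so the cache never relies on a ROWID surviving that sequence.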

Related

Configuring database change notification to get only newly inserted or updated data in Java

I am building an application that does some processing after looking up a database (oracle).
Currently, I have configured the application with Spring Integration and it polls data in a periodic fashion regardless of whether any data is updated or inserted.
The problem here is that I cannot add or use any column to distinguish between old and new records. Also, even when there has been no insert or update in the table, the poller still polls data from the database and feeds it into the message channel.
For that reason, I want to switch to database change notification, and I need to register a query, something like
SELECT * FROM EMPLOYEE WHERE STATUS='ACTIVE'
Now this ACTIVE status is true for old and new entries alike, and I want to eliminate the old entries from my list, so that only after a new insert or an update to an existing row do I get the data that was newly added or recently updated.
Well, it is really very sad that you can't modify the data model in the database. I'd really suggest insisting on changing the table for your convenience. For example, it might be just one more column, LAST_MODIFIED, so you could filter out the old records and poll only those whose date is very fresh.
Oracle also has triggers, so you can perform some action on INSERT/UPDATE and modify some other table for your purposes.
Otherwise you don't have a choice other than to use one more persistence service to track loaded records. For example, a MetadataStore based on Redis or MongoDB: https://docs.spring.io/spring-integration/docs/4.3.12.RELEASE/reference/html/system-management-chapter.html#metadata-store

MySQL check if table has some records changed

I am working on a Java application which uses a MySQL database as the data storage layer. There are a few configuration tables in the database, but each table has many thousands of records/rows. All of this configuration is cached/loaded in memory in corresponding data structures/beans (Java POJOs) when the application starts up.
Everything is fine except that every time the application starts the caching takes place and this usually takes 15-20 minutes, as the data to be cached is huge and also some columns have XML string which is parsed and then stored in beans.
So what's the big deal??
Why should we rebuild the cache when no data has changed between consecutive start-ups? I can have all the beans encapsulated in a common Config bean and serialize it, then load this serialized object the next time, once I figure out no data has changed. And yes, of course loading a serialized object is far faster than a database hit plus bean population.
So is there any way I can figure this out?
Of course at database level. I would query when the application starts - Was there any change in the database tables since it was last started. If yes do the same old boring caching process and store some unique identifier and serialize, Or if last identifier and current identifier are same just load the serialized object. This unique identifier will of course be persistent.
Add a last_updated column of type timestamp to the table.
When you need to check if there are changes on the table simply execute the query:
select max(last_updated) from YOUR_TABLE
If the last_updated is after the time you created the last cache copy you can update the cache with only the elements changed since last creation of the cache with a query similar to this one:
select * from YOUR_TABLE where last_updated > LAST_CACHE_UPDATE
As explained in the comments, it is highly recommended to add an index on the last_updated column. Using an index makes it possible to retrieve the maximum value in a table of 1.000.000.000 records in about 30 steps (not 1.000.000.000, as wrongly mentioned in the comments).
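The figure of 30 matches the depth of a binary search over a billion sorted entries, i.e. ceil(log2(1.000.000.000)). The small sketch below just checks that arithmetic; the class and method names are made up for illustration. A real B-tree index has a much higher fan-out per node, so it would typically touch fewer levels than this, which makes 30 an upper-bound style estimate rather than an exact count.

```java
// Hypothetical helper that only checks the arithmetic behind "30 steps".
final class IndexSteps {
    // Number of halvings needed to narrow n sorted entries down to one: ceil(log2(n)).
    static int binarySearchSteps(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        return n == 1 ? 0 : 64 - Long.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(binarySearchSteps(1_000_000_000L)); // prints 30
    }
}
```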
If you restart your application a lot and your cache can live in an external data store like Redis or Hazelcast, use that as the cache instead of JVM memory. When you update data, update both sides.

Is checksum a good way to see if table has been modified in MySQL?

I'm currently developing an application in Java that connects to a MySQL database using JDBC and displays records in a JTable. The application is going to be run by more than one user at a time, and I'm trying to implement a way to see if the table has been modified. E.g. user one modifies a column such as stock level, and then user two accesses the same record and tries to change it based on the level from before user one's change.
At the moment I'm storing the checksum of the table that's being displayed as a variable, and when a user tries to modify a record it will check whether the stored checksum is the same as the one generated before the edit.
As I'm new to this, I'm not sure if this is a correct way to do it or not, as I have no experience in this matter.
Calculating the checksum of an entire table seems like a very heavy-handed solution and definitely something that wouldn't scale in the long term. There are multiple ways of handling this, but the core theme is to do as little work as possible to ensure that you can scale as the number of users increases. Imagine implementing the checksum-based solution on a table with a million rows continuously updated by hundreds of users!
One of the solutions (which requires minimal re-work) would be to "check" the stock name against which the value is updated. In the background, you'll fire across a query to the table to see if the data for "that particular stock" has been updated after the table was populated. If yes, you can warn the user or mark the updated cell as dirty to indicate that that value has changed. The problem here is that the query won't be fired off till the user tries to save the updated value. Or you could poll the database to avoid that but again hardly an efficient solution.
As a more robust solution, I would recommend using a database which implements native "push notifications" to all the connected clients. Redis is a NoSQL database which comes to mind for this.
Another tried and tested technique would be to forgo a direct database connection and use a middleware layer like a message queue (e.g. RabbitMQ). Message queues enable the design of systems which communicate using messages. So, for example, every update to the stock value in the JTable would be sent across as a message to an "update database queue". Once the update is done, a message would be sent across to an "update notification queue" to which all clients would be connected. This will enable all of them to know that the value of a given stock has been updated and act accordingly. The advantage of this solution is that you get to keep your existing stack (Java, MySQL) and can implement notifications without polling the DB and killing it.
Checksum is a way to see if data has changed.
Anyway, I would suggest you store a "last_update_date" column; this column should be updated on every update of the record.
So you just have to store this date (with date-time precision) and do the check with that.
You can also add a version number column: a simple counter incremented by 1 on each update.
Note:
You can add an update trigger to maintain last_update_date; it should be 100% reliable. Maybe you don't need a trigger if you control all updates.
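The version counter idea is often called optimistic locking. Below is a minimal in-memory sketch of it, assuming a hypothetical stock table with item, level and version columns; the class VersionedStock and its methods are illustrative only. The update method mirrors what a statement like UPDATE stock SET level = ?, version = version + 1 WHERE item = ? AND version = ? would do, where zero affected rows means someone else changed the record first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table with a version column.
final class VersionedStock {
    static final class Row {
        final int level;
        final long version;

        Row(int level, long version) {
            this.level = level;
            this.version = version;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    synchronized void insert(String item, int level) {
        rows.put(item, new Row(level, 0));
    }

    // What a user sees when the record is displayed; the version is kept with it.
    synchronized Row read(String item) {
        return rows.get(item);
    }

    // Succeeds only if nobody has updated the row since expectedVersion was read.
    synchronized boolean update(String item, int newLevel, long expectedVersion) {
        Row current = rows.get(item);
        if (current == null || current.version != expectedVersion) {
            return false; // stale edit: the caller should reload and retry or warn the user
        }
        rows.put(item, new Row(newLevel, current.version + 1));
        return true;
    }
}
```

If two users read the same row at version 0, the first update succeeds and bumps the version to 1, and the second update is rejected because its expected version no longer matches.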
When used in network communication:
A checksum is a count of the number of bits in a transmission unit
that is included with the unit so that the receiver can check to see
whether the same number of bits arrived. If the counts match, it's
assumed that the complete transmission was received.
So the same idea can be applied to check whether two objects differ; your approach is correct.

Is there any heuristic/pattern for logging user actions

I have a GWT/Java/Hibernate/MySQL application (but I think any web pattern could be valid) that does CRUD on several objects. Each object is stored in a table in the database. I want to implement an action logger. For example, for Object A I want to know who created it and modified it, and for User B, what actions he performed.
My idea is to have a History table that stores: UserId, ObjectId, ActionName. The UserId and ObjectId are foreign keys. Am I on the right track?
I also think this is the right direction.
However, bear in mind that in an application with lots of traffic, these logs can become an overhead.
I would suggest the following in this case -
A. Don't use hibernate for this "action logging" - Hibernate has better performance for "mostly read DB"
B. Consider DB that is better in "mostly write" scenario for the action logging table.
You can try to look for a NoSQL solution for this.
C. If you use such a NoSQL DB but still want to keep the logged actions in the relational DB, have an offline process that runs periodically (once a day, for example), queries your "action logging DB" and inserts the results into the relational DB.
D. If it's ok that your system might lose some action logging, consider using producer/consumer pattern (for example - use a queue between producer and consumer thread) - the threads that need to log actions will not log them synchronously, but will log them asynchronously.
E. In addition, don't forget that such logging table has the potential to be over-flooded in time, causing queries on it to take a long time. For these issues consider the following:
E.1. Every day remove really old logs - let's say - older than month, or move them to some "backup" table.
E.2. Index the fields that you mostly use for action logging queries (for example, maybe an action_type field).
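Point D above (producer/consumer with a queue) could be sketched roughly as follows. This is only a minimal illustration: the class AsyncActionLogger is hypothetical, log entries are plain strings, and an in-memory list stands in for the History table. Request threads call log(), which never blocks and drops the entry if the queue is full, which is the "might lose some action logging" trade-off mentioned in D.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

// Hypothetical asynchronous action logger using a bounded queue.
final class AsyncActionLogger implements AutoCloseable {
    private final BlockingQueue<String> queue;
    // Stand-in for the History table; a real consumer would do the INSERT here.
    private final List<String> sink = new CopyOnWriteArrayList<>();
    private final Thread consumer;
    private volatile boolean running = true;

    AsyncActionLogger(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
        consumer = new Thread(this::drain, "action-log-writer");
        consumer.setDaemon(true);
        consumer.start();
    }

    // Called by producers; returns false if the entry was dropped because the queue is full.
    boolean log(long userId, long objectId, String actionName) {
        return queue.offer(userId + "," + objectId + "," + actionName);
    }

    // Consumer loop: keeps writing until close() was called and the queue is empty.
    private void drain() {
        try {
            while (running || !queue.isEmpty()) {
                String entry = queue.poll(50, TimeUnit.MILLISECONDS);
                if (entry != null) {
                    sink.add(entry);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> written() {
        return sink;
    }

    // Stops accepting work from the consumer's point of view and waits for the queue to drain.
    @Override
    public void close() {
        running = false;
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A bounded queue is used on purpose so that a slow log store cannot exhaust memory; whether dropping entries is acceptable depends on how much the action history matters.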
If only changes to specific fields, e.g., something like status in a users table, should be tracked, I would use a user_status_histories table being referenced from the users table via foreign key. The user_status_histories table would contain fields such as current_status, date and something like admin_who_modified_the_status.
Whenever a status change is made, a new record would be inserted into the user_status_histories table. This would allow easy querying of all status changes.
Of course, querying a user would then require a (LEFT or INNER) JOIN with the user_status_histories table in order to get the last record (= the current status).
Depending on your needs, you might think of a current_status field in the users table (besides the status serving as foreign key) for fast access, which would be maintained parallel to the user_status_histories table.
Yes you are. Another very similar framework is one which supports undo and redo. These frameworks track user actions and have the additional ability to restore state to the way it was before the user action.

database polling using Java

I am stuck at a point where I need to get database changes in Java code. The requirement is that any record updated, added or deleted in any table of the DB should be recognized by the Java program. How could it be implemented: JMS, or a Java thread?
Update: Thanks guys for your support. I am actually using Oracle as the DB and WebLogic 10.3 Workshop. I want to get the updates from a table on which I have only read permission, so what do you all suggest? I can't update the DB. The only thing I can do is read the DB, and if there is any change in the table I have to get the information/notification that certain data rows have been added, deleted or updated.
Unless the database can send a message to Java, you'll have to have a thread that polls.
A better, more efficient model would be one that fires events on changes. A database that has Java running inside (e.g., Oracle) could do it.
We do it by polling the DB using an EJB timer task. In essence, we have a status field which we update when we have processed that row.
So the EJB timer thread calls a procedure that grabs rows which are flagged "un-treated".
Dirty, but also very simple and robust. Especially, after a crash or something, it can still pick up from where it crashed without too much complexity.
The disadvantage is the wasted load on the DB, and also response time will be limited (probably requires seconds).
We have accomplished this in our firm by adding triggers to database tables that call an executable to issue a Tib Rendezvous message, which is received by all interested Java applications.
However, the ideal way to do this IMHO is to be in complete control of all database writes at the application level, and to notify any interested parties at this point (via multi-cast, Tib, etc). In reality this isn't always possible where you have a number of disparate systems.
You're indeed dependent on whether the database in question supports it. You'll also need to take the overhead into account. A lot of inserts/updates also means a lot of notifications, and your Java code has to handle them consistently, else it will bubble up.
If the data model allows it, just add an extra column which holds a timestamp that gets updated on every insert/update. Most major DBs support an auto-update of the column on every insert/update. I don't know which DB server you're using, so I'll give only a MySQL-targeted example:
CREATE TABLE mytable (
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
somevalue VARCHAR(255) NOT NULL,
lastupdate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX (lastupdate)
)
This way you don't need to worry about inserting/updating the lastupdate yourself. You can just do an INSERT INTO mytable (somevalue) VALUES (?) or UPDATE mytable SET somevalue = ? WHERE id = ? and the DB will do the magic.
After ensuring that the DB server's time and Java application's time are the same, you can just fire a background thread (using either Timer with TimerTask, or ScheduledExecutorService with Runnable or Callable) which does roughly this:
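// Requires java.sql.PreparedStatement, java.sql.ResultSet and java.sql.Timestamp.
// lastTimeChecked is a java.sql.Timestamp field; setDate() would drop the time of day.
Timestamp now = new Timestamp(System.currentTimeMillis());
// Half-open range so a row stamped exactly at a boundary is not handled twice.
String sql = "SELECT id FROM mytable WHERE lastupdate > ? AND lastupdate <= ?";
try (PreparedStatement statement = connection.prepareStatement(sql)) {
    statement.setTimestamp(1, this.lastTimeChecked);
    statement.setTimestamp(2, now);
    try (ResultSet resultSet = statement.executeQuery()) {
        while (resultSet.next()) {
            // Handle accordingly.
        }
    }
}
this.lastTimeChecked = now;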
Update: as per the question update, it turns out that you have no control over the DB. Well, then you don't have many good/efficient options. Either just refresh the entire list in Java memory with the entire data from the DB without checking/comparing for changes (probably the fastest way), or dynamically generate a SQL query based on the current data which excludes the current data from the results.
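If the full refresh is used but the application still needs to know which rows were added, deleted or updated, one option is to compare the previous and the new snapshot in memory. The sketch below assumes each snapshot is a map from a row key (for example the primary key as a string) to some comparable representation of the row; the class SnapshotDiff is hypothetical and only illustrates the comparison, not the loading.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diff between two full snapshots of a read-only table.
final class SnapshotDiff {
    final Set<String> added = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> updated = new TreeSet<>();

    // Keys only in 'after' are added, keys only in 'before' are removed,
    // and keys in both with different values are updated.
    static <V> SnapshotDiff between(Map<String, V> before, Map<String, V> after) {
        SnapshotDiff diff = new SnapshotDiff();
        for (Map.Entry<String, V> entry : after.entrySet()) {
            String key = entry.getKey();
            if (!before.containsKey(key)) {
                diff.added.add(key);
            } else if (!Objects.equals(before.get(key), entry.getValue())) {
                diff.updated.add(key);
            }
        }
        for (String key : before.keySet()) {
            if (!after.containsKey(key)) {
                diff.removed.add(key);
            }
        }
        return diff;
    }
}
```

This costs one full read per polling cycle plus memory for two snapshots, so it is only reasonable for tables that are small enough to reload in full.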
I assume that you're talking about a situation where anything can update a table. If for some reason you're instead talking about a situation where only the Java application will be updating the table that's different. If you're using Java only you can put this code in your DAO or EJB doing the update (it's much cleaner than using a trigger in this case).
An alternative way to do this is to funnel all database calls through a web service API, or perhaps a JMS API, which does the actual database calls. Processes could register there to get a notification of a database update.
We have a similar requirement. In our case we have a legacy system, and we do not want to adversely impact performance on the existing transaction table.
Here's my proposal:
A new work table with pk to transaction and insert timestamp
A new audit table that has same columns as transaction table + audit columns
Trigger on transaction table to dump all insert/update/deletes to an audit table
Java process to poll the work table, join to the audit table, publish the event in question and delete from the work table.
Question is: What do you use for polling? Is quartz overkill? How can you scale back the polling frequency based on the current DB load?
