I'm currently working on a RESTful web service that is reading and writing from two different databases. The problem I am running into is handling cases where my service gets called and then a second call is received before the first finishes writing. The service reads a date field in the database to determine if it needs to INSERT or UPDATE and then writes to that field in the chosen operation. If the second call is made before the first is finished, the date field will not have been written to, so I end up with two INSERTS rather than an INSERT and an UPDATE.
I tried using the concurrency API available in Java as well as in Groovy, but so far I have not been able to get it to work. The RESTful service looks up a fresh copy of the model class each time it is called, and the model class then gets a new instance of the Groovy object via dependency injection. As a result, I can't put an instance of the Lock in either place, since each call of the RESTful service will be working on a new instance of the model and of the Groovy object.
Can anyone suggest a better way to do this? Any help would be appreciated.
Update
Here is some pseudo code for the service
1. Look up model data by id from table A of database 1.
2. Look up the most recent entry in table B of database 2 where the id matches a key stored in the model (dw_id).
3. Compare the start_date column of the results to the current datetime.
4. If the day of start_date equals the current day:
   4a. Execute an UPDATE query on table B of database 2 using data obtained from the model.
5. Else:
   5a. Execute an UPDATE query on table B of database 2, replacing the value of the end_date column with yesterday's date where id == dw_id.
   5b. Execute an INSERT on table B of database 2, using data from the model, setting start_date to today's date and end_date to a constant future date.
   5c. Execute an UPDATE on table A of database 1, replacing the dw_id of the model with the auto-generated id of the entry created by the INSERT from 5b.
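In Java, that flow looks roughly like the sketch below. This is only a sketch: the table and column names (table_b, dw_id, start_date, end_date) come from the pseudocode above, and Model plus the helper methods are hypothetical. Note that the race described in the question lives in the gap between the SELECT and the writes:

void upsertDailyRow(Connection db1, Connection db2, Model model) throws SQLException {
    // Step 2: most recent entry in table B whose id matches the model's dw_id.
    PreparedStatement ps = db2.prepareStatement(
        "SELECT start_date FROM table_b WHERE id = ? ORDER BY start_date DESC");
    ps.setLong(1, model.getDwId());
    ResultSet rs = ps.executeQuery();

    // Steps 3-4: same day means UPDATE, otherwise close out and INSERT.
    LocalDate today = LocalDate.now();
    if (rs.next() && rs.getDate("start_date").toLocalDate().isEqual(today)) {
        updateTableB(db2, model);                   // step 4a
    } else {
        // The race lives here: a second call can reach this branch
        // before the first call's INSERT below has been committed.
        closeOutPreviousRow(db2, model.getDwId());  // step 5a: end_date = yesterday
        long newId = insertTableB(db2, model);      // step 5b: start_date = today
        updateDwId(db1, model, newId);              // step 5c: write new id to table A
    }
}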
UPDATE 2
I ended up implementing something similar to the solution proposed by jan-willem-gmelig-meyling. I found a good example of an implementation here: https://stackoverflow.com/a/27806218/32453 that I was able to adapt to suit my needs.
In Java you can do this by synchronising on a lock object. This, however, only works if you acquire the lock before starting a database transaction; otherwise, the two parallel threads will each start a database transaction and operate on isolated database versions. So what you roughly get is:
private final static Object lock = new Object();

public void myResource() {
    synchronized (lock) {
        // Begin tx, do work, end tx.
        // A second caller won't get here before the first caller is finished.
    }
}
This, however, breaks server statelessness conventions. It is much better to lock the relevant rows or table within your database transaction. This prevents another connection from simultaneously reading from and writing to the same table. Investigate which locking mechanisms your database (or ORM library, if you are using one) supports. A technique such as JTA can be utilised to share database transactions between servers, services, or multiple databases.
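As an illustration of the in-transaction locking approach, here is a minimal JDBC sketch (the table and column names follow the question's pseudocode; exact locking syntax and behaviour depend on your database):

// Pessimistic row lock via SELECT ... FOR UPDATE: a second caller's
// SELECT blocks until the first caller's transaction commits, so it
// then sees the freshly written start_date and takes the UPDATE path.
connection.setAutoCommit(false);
try {
    PreparedStatement lock = connection.prepareStatement(
        "SELECT start_date FROM table_b WHERE id = ? FOR UPDATE");
    lock.setLong(1, dwId);
    ResultSet rs = lock.executeQuery();
    // ... decide INSERT vs UPDATE based on rs and execute it here ...
    connection.commit();
} catch (SQLException e) {
    connection.rollback();
    throw e;
}

Note that FOR UPDATE only locks rows that already exist; if the matching row may be missing entirely, a table-level lock or a unique constraint is needed to close that last gap.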
Related
I am developing a Java application that loads certain things from a database, such as client records and product info. When the user navigates to, say, the 'products' tab, I query for products in the database and update a table with that information.
I am wondering if there is a way to see if the query results have changed since the last check, in order to avoid querying and loading all the info from the database, and instead just loading the updates. Is there a way to do this, or perhaps to load only the changes from a query into my table list? My goal is to make the program run faster when switching between tabs.
I am wondering if there is a way to see if the query results have changed since the last check
Stated differently, you want a way to automatically answer the question “is this the same result?” without retrieving the entire result.
The general approach to this problem would be to come up with some fast-to-query proxy for the entire state of the result set, and query that instead.
Once you have determined a stable fast computation for the entire result set, you can compute that any time the relevant data changes; and only poll that stored proxy to see whether the data has changed.
For example, you could say that “the SHA-256 hash of fields lorem, ipsum, and dolor” is your proxy. You can now:
Implement that computation inside the database as a function, maybe products_hash.
Create a latest_products_hash table, that stores created timestamp and products_hash that was computed at that time.
In your application, retrieve the most recent record from latest_products_hash and keep it for reference.
In the database, have a scheduled job, or a trigger on some event you decide makes sense, that will compute and store the products_hash in latest_products_hash automatically without any action from the application.
To determine whether there have been updates yet, the application will query the latest_products_hash table again and compare its most recent record with the one the application stored for reference.
Only if the latest_products_hash most-recent value is different, then query the products table and get the full result set.
That way, the application is polling a much faster query (the most-recent record in latest_products_hash) frequently, and avoiding the full products query until it knows the result set will be new.
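A rough sketch of the application side in Java (the latest_products_hash table and its created and products_hash columns follow the outline above; the full-reload call is a placeholder):

// Poll the cheap proxy query; only re-run the expensive products
// query when the stored hash differs from the one seen last time.
private String lastSeenHash;

void pollForChanges(Connection conn) throws SQLException {
    PreparedStatement ps = conn.prepareStatement(
        "SELECT products_hash FROM latest_products_hash " +
        "ORDER BY created DESC LIMIT 1");  // row-limit syntax varies by DB
    ResultSet rs = ps.executeQuery();
    if (rs.next()) {
        String currentHash = rs.getString("products_hash");
        if (!currentHash.equals(lastSeenHash)) {
            reloadProductsTable(conn);  // placeholder for the full products query
            lastSeenHash = currentHash;
        }
    }
}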
HELP!
The case is that two or more transactions are trying to affect the same client monetary account in some external system. I need the second transaction to be performed only after the first one has finished.
consider:
- there are two or more transactions trying to affect the same balance
- there are multiple clients at the same time
- working with 1000 TPS with 100ms avg per transaction
ideas:
- As we are working with multiple threads to support 1000 TPS, I'm trying to create queues keyed on the client ID, using some kind of work manager that limits execution to one thread per client. That way, if I receive two requests with the same client ID at the same time, the second can be queued dynamically.
tools
I'm trying to use Oracle tools, for example:
- Fusion Middleware: using the WorkManager based on message context [not sure if this is possible, because it looks like context can be based only on session data]. I like WorkManager because it has no performance issues.
- Oracle OCEP: creating a dynamic queue using CQL [not sure if this is possible, or how it performs].
- Oracle Advanced Queuing: maybe possible with transaction groups.
Thanks for any ideas.
I hope I understand your problem.
In your question you asked whether it is possible to perform a second transaction on a row before the first is completed. This is impossible! A database which follows the ACID paradigm has to be consistent, so you can't "overtake" the first transaction! If you want to do that, you should use a NoSQL database (like MongoDB, ...) where consistency is not as strong.
But maybe you want to know whether there is an Oracle view to figure out whether a row is locked or not. Let's assume there is such a view. You would check it, and if there is no lock, you would start your update/delete. But you can't be sure this will work, because even 1 ms after you checked, another process can put a lock on the row.
The only thing you can do is put a SELECT ... FOR UPDATE NOWAIT before your UPDATE/DELETE statement.
If the row is locked, you will get an exception (ORA-00054: resource busy). This is the recommended, out-of-the-box way to let the database manage row-level locking for you!
See the following example with the emp table. Note: to try this out, run this code in two different sessions at the same time.
declare
    l_sal number;
    resource_busy exception;                     -- declare your own exception
    pragma exception_init (resource_busy, -54);  -- connect your exception with ORA-00054
begin
    select sal
      into l_sal
      from emp
     where empno = 7934
       for update NOWAIT;

    update emp
       set sal = sal + 100
     where empno = 7934;
exception
    when resource_busy then
        null;  -- in your case, simply do nothing if the row is locked
end;
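If you drive the database from Java instead of PL/SQL, the same pattern looks roughly like this (a sketch; with the Oracle JDBC driver, ORA-00054 should surface as a SQLException whose getErrorCode() is 54):

connection.setAutoCommit(false);
try {
    PreparedStatement lock = connection.prepareStatement(
        "SELECT sal FROM emp WHERE empno = 7934 FOR UPDATE NOWAIT");
    lock.executeQuery();  // throws immediately if the row is locked

    PreparedStatement upd = connection.prepareStatement(
        "UPDATE emp SET sal = sal + 100 WHERE empno = 7934");
    upd.executeUpdate();
    connection.commit();
} catch (SQLException e) {
    connection.rollback();
    if (e.getErrorCode() == 54) {
        // Row is locked by another session: back off, queue, or retry.
    } else {
        throw e;
    }
}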
Gentlemen/ladies,
I've got a problem with concurrent updates of the same entity.
Process 1 obtains a collection of objects. This process doesn't use Hibernate to retrieve the data, for the sake of performance, which sounds a bit far-fetched to me. This process also updates some fields of some objects from the collection using Hibernate.
Process 2 obtains an object similar to one of those in collection (basically the same row in DB) and updates it somehow. This process uses Hibernate.
Since process 1 and process 2 don't know about each other they can update the same entity, leaving it in non-consistent state.
For example:
process 1 obtains collection
process 2 obtains one entity and removes some of its properties along with an entity it was linking to
process 1 gets back and tries to save that entity and gets an entity-not-found exception
I need to deal with this situation.
So what can be done?
For now I see two ways:
create a layer above the database that keeps track of every entity in the system, effectively prohibiting the creation of multiple instances of the same entity
set up optimistic locks; since some entities are not obtained through Hibernate, I would need to implement this somewhat differently for them
Any ideas would be very helpful
Thanks in advance
Since process 1 and process 2 don't know about each other they can update the same entity, leaving it in non-consistent state.
I'd reformulate that: both processes can update the same data. Only Hibernate would know the entities while the other process seems to access the data via JDBC.
I'd go for option 2 which would involve a version column in your entities.
IIRC, Hibernate would then add a WHERE version = x condition to its queries and check whether all rows have been updated; if not, an OptimisticLockException would be thrown. You can do the same in your JDBC queries, i.e. UPDATE ... SET ... version = x + 1 ... WHERE version = x AND additionalConditions, and check the number of updated rows that is returned by JDBC.
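To make that concrete, here is a sketch of both sides (the entity and column names are illustrative; @Version is the standard JPA/Hibernate mechanism):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

// Hibernate side: with @Version, Hibernate appends "AND version = ?" to
// its UPDATEs and throws an OptimisticLockException if no row matched.
@Entity
class Account {
    @Id Long id;
    @Version long version;
    String name;
}

// Plain-JDBC side (the non-Hibernate process): do the same check by hand.
class AccountJdbcDao {
    // Returns false if the row was changed or deleted since it was read.
    boolean updateName(Connection conn, long id, long expectedVersion,
                       String newName) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
            "UPDATE account SET name = ?, version = version + 1 " +
            "WHERE id = ? AND version = ?");
        ps.setString(1, newName);
        ps.setLong(2, id);
        ps.setLong(3, expectedVersion);
        return ps.executeUpdate() == 1;
    }
}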
I'm working with a code base that is new to me, and it uses iBatis.
I need to update or add to an existing table, and it may involve 20,000+ records.
The process will run once per day, and run in the middle of the night.
I'm getting the data from a web services call. I plan to get the data, then populate one model type object per record, and pass each model type object to some method that will read the data in the object, and update/insert the data into the table.
Example:
ArrayList records = new ArrayList();

Foo foo = new Foo();
foo.setFirstName("Homer");
foo.setLastName("Simpson");
records.add(foo);
// ...make more Foo objects and add them to the ArrayList...

updateOrInsert(records); // iterates over the list and calls a method that does the updating/inserting
My main question is how to handle all of the updating/inserting as a transaction. If the system goes down before all of the records are read and used to update/insert the table, I need to know, so I can go back to the web services call and try again when the system is OK.
I am using Java 1.4, and the db is Oracle.
I would highly recommend you consider using Spring Batch - http://static.springsource.org/spring-batch/
The framework provides a lot of the essential features required for batch processing: error reporting, transaction management, multi-threading, scaling, and input validation.
The framework is very well designed and very easy to use.
The approach you have listed might not perform very well, since you are waiting to read all objects, storing them all in memory and then inserting in the database.
You might want to consider designing the process as follows:
1. Create a cache capable of storing 200 objects
2. Invoke the web service to fetch the data
3. Create an instance of an object, validate it, and store the data in the object's fields
4. Add the object to the cache
5. When the cache is full, perform a batch commit of the objects in the cache to the database
6. Continue with step 1
Spring Batch will allow you to perform batch commits, control the size of the batch commits, and perform error handling both when reading input (in your case, retrying the request) and while writing the data to the database.
Have a look at it.
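Whether or not you adopt Spring Batch, the underlying idea is chunked commits. A bare-bones JDBC illustration of the cache-and-commit loop above (kept Java 1.4 compatible, as the question requires; the table and column names are made up):

// Insert records in chunks of 200, committing after each chunk, so a
// crash loses at most the current, uncommitted chunk.
connection.setAutoCommit(false);
PreparedStatement ps = connection.prepareStatement(
    "INSERT INTO person (first_name, last_name) VALUES (?, ?)");
int inChunk = 0;
for (Iterator it = records.iterator(); it.hasNext();) {
    Foo foo = (Foo) it.next();
    ps.setString(1, foo.getFirstName());
    ps.setString(2, foo.getLastName());
    ps.addBatch();
    if (++inChunk == 200) {
        ps.executeBatch();
        connection.commit();  // checkpoint: these rows are now durable
        inChunk = 0;
    }
}
if (inChunk > 0) {  // flush the final partial chunk
    ps.executeBatch();
    connection.commit();
}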
I am stuck at a point where I need to get database changes in Java code. The requirement is that any record updated, added, or deleted in any table of the DB should be recognized by a Java program. How could it be implemented? JMS, or a Java thread?
Update: Thanks, guys, for your support. I am actually using Oracle as the DB and WebLogic 10.3 Workshop. I want to get the updates from a table on which I have only read permission, so what do you all suggest? I can't update the DB. The only thing I can do is read it, and if there is any change in the table, I have to get the information/notification that certain data rows have been added/deleted/updated.
Unless the database can send a message to Java, you'll have to have a thread that polls.
A better, more efficient model would be one that fires events on changes. A database that has Java running inside (e.g., Oracle) could do it.
We do it by polling the DB using an EJB timer task. In essence, we have a status field which we update when we have processed that row.
So the EJB timer thread calls a procedure that grabs rows which are flagged "un-treated".
Dirty, but also very simple and robust. Especially, after a crash or something, it can still pick up from where it crashed without too much complexity.
The disadvantage is the wasted load on the DB, and also response time will be limited (probably requires seconds).
We have accomplished this in our firm by adding triggers to database tables that call an executable to issue a Tib Rendezvous message, which is received by all interested Java applications.
However, the ideal way to do this IMHO is to be in complete control of all database writes at the application level, and to notify any interested parties at this point (via multi-cast, Tib, etc). In reality this isn't always possible where you have a number of disparate systems.
You're indeed dependent on whether the database in question supports it. You'll also need to take the overhead into account: a lot of inserts/updates also means a lot of notifications, and your Java code has to handle them consistently, or it will fall behind.
If the data model allows it, just add an extra column which holds a timestamp that gets updated on every insert/update. Most major DBs support auto-updating such a column on every insert/update. I don't know which DB server you're using, so I'll give only a MySQL-targeted example:
CREATE TABLE mytable (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    somevalue VARCHAR(255) NOT NULL,
    lastupdate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX (lastupdate)
)
This way you don't need to worry about inserting/updating the lastupdate yourself. You can just do an INSERT INTO mytable (somevalue) VALUES (?) or UPDATE mytable SET somevalue = ? WHERE id = ? and the DB will do the magic.
After ensuring that the DB server's time and Java application's time are the same, you can just fire a background thread (using either Timer with TimerTask, or ScheduledExecutorService with Runnable or Callable) which does roughly this:
java.sql.Timestamp now = new java.sql.Timestamp(System.currentTimeMillis());

PreparedStatement statement = connection.prepareStatement(
    "SELECT id FROM mytable WHERE lastupdate BETWEEN ? AND ?");
statement.setTimestamp(1, this.lastTimeChecked); // use setTimestamp: the column is a TIMESTAMP
statement.setTimestamp(2, now);                  // (setDate would truncate the time part)

ResultSet resultSet = statement.executeQuery();
while (resultSet.next()) {
    // Handle each changed row accordingly.
}
this.lastTimeChecked = now;
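Wrapped in one of the schedulers mentioned above, the check might look like this (a sketch; checkForUpdates() stands for the snippet above):

// Run the lastupdate check every 5 seconds on a background thread.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(new Runnable() {
    public void run() {
        try {
            checkForUpdates();  // the JDBC snippet above
        } catch (Exception e) {
            // Log and carry on: an uncaught exception cancels the schedule.
            e.printStackTrace();
        }
    }
}, 0, 5, TimeUnit.SECONDS);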
Update: as per the question update, it turns out that you have no control over the DB. Well, then you don't have many good/efficient options. Either just refresh the entire list in Java memory with the entire data from the DB without checking/comparing for changes (probably the fastest way), or dynamically generate a SQL query based on the current data which excludes the current data from the results.
I assume that you're talking about a situation where anything can update a table. If for some reason you're instead talking about a situation where only the Java application will be updating the table that's different. If you're using Java only you can put this code in your DAO or EJB doing the update (it's much cleaner than using a trigger in this case).
An alternative way to do this is to funnel all database calls through a web service API, or perhaps a JMS API, which does the actual database calls. Processes could register there to get a notification of a database update.
We have a similar requirement. In our case we have a legacy system that we do not want to adversely impact performance on the existing transaction table.
Here's my proposal:
A new work table with pk to transaction and insert timestamp
A new audit table that has same columns as transaction table + audit columns
Trigger on transaction table to dump all insert/update/deletes to an audit table
Java process to poll the work table, join to the audit table, publish the event in question and delete from the work table.
Question is: what do you use for polling? Is Quartz overkill? How can you scale back the polling frequency based on the current DB load?
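For what it's worth, a minimal JDBC sketch of that polling pass (the work_table/audit_table names and columns are illustrative, and publishEvent is a placeholder):

// One polling pass: read pending work, publish each audited change,
// and delete only the rows that were actually processed.
void pollOnce(Connection conn) throws SQLException {
    conn.setAutoCommit(false);
    try {
        PreparedStatement select = conn.prepareStatement(
            "SELECT w.transaction_id, a.* FROM work_table w " +
            "JOIN audit_table a ON a.transaction_id = w.transaction_id");
        PreparedStatement delete = conn.prepareStatement(
            "DELETE FROM work_table WHERE transaction_id = ?");
        ResultSet rs = select.executeQuery();
        while (rs.next()) {
            publishEvent(rs);  // placeholder: Tib Rendezvous, JMS, etc.
            delete.setLong(1, rs.getLong("transaction_id"));
            delete.executeUpdate();
        }
        conn.commit();
    } catch (SQLException e) {
        conn.rollback();
        throw e;
    }
}

As for the polling frequency: a simple adaptive scheme (poll again immediately when a pass found rows, back off towards some maximum interval when it found none) keeps load off an idle DB, and a plain Timer or ScheduledExecutorService is usually enough for that, so Quartz is likely overkill unless you already run it.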