I have more of a theoretical question:
When does data actually get inserted into the database: after persist, or only after commit is called? I'm asking because I have a problem with unique keys (manually generated): they get duplicated. I suspect this happens because multiple users insert data into the same table simultaneously.
UPDATE 1:
I generate the keys in my application. Example keys: '123456789123', '123456789124', '123456789125'...
The key field is of varchar type, because there are a lot of old keys (which I can't delete or change) like 'VP123456' and 'VP15S3456'. Another problem is that after being inserted into one database, these keys have to be inserted into another database. And I don't know what DB sequences and atomic objects are.
UPDATE 2:
These keys are used in finance documents, not as database keys. So they must be unique, but they are not used anywhere in the code as object keys.
I would suggest you create a Singleton that takes care of generating your keys. Make sure you can only get a new id once the singleton has been initialized with the latest value from the database.
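A minimal sketch of that idea (the class name is made up, and the initial value would come from your own "select max(key)" query):

import java.util.concurrent.atomic.AtomicLong;

public final class KeyGenerator {
    private static KeyGenerator instance;
    private final AtomicLong lastKey;

    private KeyGenerator(long latestKeyFromDb) {
        this.lastKey = new AtomicLong(latestKeyFromDb);
    }

    // Call exactly once at startup, with the latest key value read from the DB.
    public static synchronized void initialize(long latestKeyFromDb) {
        if (instance == null) {
            instance = new KeyGenerator(latestKeyFromDb);
        }
    }

    // Fails fast if someone asks for a key before initialization.
    public static synchronized KeyGenerator getInstance() {
        if (instance == null) {
            throw new IllegalStateException("KeyGenerator not initialized");
        }
        return instance;
    }

    public String nextKey() {
        return String.valueOf(lastKey.incrementAndGet());
    }
}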
To safeguard against incomplete inserts into the two databases, I would suggest you use XA transactions. They allow you to have all-or-nothing inserts and updates: if any of the operations on either database fails, everything is rolled back. Of course, there is a downside to XA transactions; they are quite slow, and not all databases and database drivers support them.
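A rough sketch of such an all-or-nothing insert, assuming a JTA-capable container and two XA-aware DataSources (the JNDI names, table, and column are made up):

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;

public class DualInsert {
    public void insertIntoBoth(String key) throws Exception {
        InitialContext ctx = new InitialContext();
        UserTransaction utx = (UserTransaction) ctx.lookup("java:comp/UserTransaction");
        DataSource ds1 = (DataSource) ctx.lookup("jdbc/firstXADataSource");
        DataSource ds2 = (DataSource) ctx.lookup("jdbc/secondXADataSource");

        String sql = "INSERT INTO finance_documents (doc_key) VALUES (?)";
        utx.begin();
        try {
            try (Connection c1 = ds1.getConnection();
                 Connection c2 = ds2.getConnection();
                 PreparedStatement ps1 = c1.prepareStatement(sql);
                 PreparedStatement ps2 = c2.prepareStatement(sql)) {
                ps1.setString(1, key);
                ps1.executeUpdate();
                ps2.setString(1, key);
                ps2.executeUpdate();
            }
            utx.commit();   // both inserts become visible together
        } catch (Exception e) {
            utx.rollback(); // neither database is changed
            throw e;
        }
    }
}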
How do you generate these keys? Have you tried using DB sequences or atomic objects?
I'm asking because it is perfectly normal to populate a DB concurrently.
EDIT1:
You can write a method that returns new keys based on an atomic counter; this way you know that any time you request a new key, you receive a unique one. This strategy may lead to some keys being discarded, but that is a small price to pay, unless it is a requirement that the keys in the database are sequential.
import java.util.concurrent.atomic.AtomicLong;

private AtomicLong counter; // initialized somewhere else, e.g. from the latest key in the DB

public String getKey() {
    return "VP" + counter.incrementAndGet();
}
And here's some help on DB sequences in Oracle, MySQL, etc.
I'm developing a REST API using Java and Spring Boot to manage purchases and customers. In my MySQL database I have a Purchase table with a column that stores a unique ticketId. It is not the primary key.
When a new purchase is added (via a PUT request), I create a new purchase from the data provided in the request, obtain the max ticketId, increment it by one, and store it in the database. The primary key is auto-incremented.
This is my code:
@Transactional
public boolean saveNewPurchase(PurchaseDTO data) {
    Purchase p = createPurchaseFromData(data);
    Long idTicket = purchaseDao.getMaxIdTicket(); // e.g. SELECT MAX(ticket_id)
    p.setIdTicket(idTicket + 1);
    save(p);
    return true; // the method is declared boolean, so it must return a value
}
Are there concurrency issues here? Let's say two PUT requests execute this method in parallel: could they retrieve the same max idTicket and thus violate the unique idTicket constraint when the second purchase is saved?
If so, how could I solve it? Would making the method synchronized solve the problem?
Thanks.
Yes, there is a concurrency issue here. Two different threads could get the same maxIdTicket and then save a Purchase with the same ticketId.
I see three solutions here:
Use synchronized
Use an AtomicInteger to keep the counter in memory
Use a dedicated table in MySQL with a single AUTO_INCREMENT column, and insert a row and read back its generated id each time you need a counter value. With other RDBMSs you could use a sequence, but I am pretty sure there are no sequences in MySQL.
The first two solutions do not work in a distributed environment, so I would go with the third; a sketch follows.
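A minimal sketch of the third option in plain JDBC; the ticket_counter table is an assumption:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class TicketIdGenerator {
    // Assumed schema:
    // CREATE TABLE ticket_counter (id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY);
    private final DataSource dataSource;

    public TicketIdGenerator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public long nextTicketId() throws Exception {
        try (Connection c = dataSource.getConnection();
             PreparedStatement ps = c.prepareStatement(
                     "INSERT INTO ticket_counter VALUES (NULL)",
                     Statement.RETURN_GENERATED_KEYS)) {
            ps.executeUpdate();
            try (ResultSet rs = ps.getGeneratedKeys()) {
                rs.next();
                // MySQL hands out each auto-increment value only once,
                // even across multiple application instances.
                return rs.getLong(1);
            }
        }
    }
}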
Yes, it is certainly prone to concurrency issues. One suggestion is to keep the counter in memory with an AtomicInteger (or AtomicLong), so that you don't read the current max id from the DB on every request, which is what opens the door to race conditions. When the application starts up, query the DB once and seed the in-memory counter with the max id. This is robust: even after a crash, the application can always re-seed itself from the DB.
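A minimal sketch of that seeding, assuming a Spring component and the purchaseDao from the question (the class names TicketIdCounter and PurchaseDao are made up):

import java.util.concurrent.atomic.AtomicLong;
import javax.annotation.PostConstruct;
import org.springframework.stereotype.Component;

@Component
public class TicketIdCounter {
    private final AtomicLong counter = new AtomicLong();
    private final PurchaseDao purchaseDao; // the DAO from the question

    public TicketIdCounter(PurchaseDao purchaseDao) {
        this.purchaseDao = purchaseDao;
    }

    @PostConstruct
    void seedFromDatabase() {
        Long max = purchaseDao.getMaxIdTicket();
        counter.set(max == null ? 0L : max); // re-seeds correctly after a crash
    }

    public long nextIdTicket() {
        return counter.incrementAndGet();
    }
}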
This won't work in a distributed environment; in that case, having another table dedicated to storing the counter is the better approach, as suggested by JamieB in the comments.
My application has quite a large number of tables in the DB. What is an efficient way to generate keys for Memcached? Whenever we update a table's data, we have to check whether there is any cached data related to that table and clear it. I also need to take care of join queries: if either of the tables involved in a cached join is modified, the cached data should be cleared too.
The key could be of the form DB_TABLENAME_PrimaryKey, where PrimaryKey is the value of the table's primary key for that row.
In a custom client class, say CustomAppCache, define an inner class, say CacheKeyGen, with properties for the database, the tableName, and the primaryKeyField. Memcached will then hold the table's row data as the value, stored under the key DB_TABLENAME_PrimaryKey.
When setting the cache (setCache), write the row's data to memcached under this key.
When reading the cache (getCache), match the keys against the requisite pattern and perform the intended operation, such as deleting the entry from the cache and reloading it.
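A minimal sketch of this scheme, assuming spymemcached's MemcachedClient (the expiry and method shapes are illustrative):

import net.spy.memcached.MemcachedClient;

public class CustomAppCache {
    private final MemcachedClient client;

    public CustomAppCache(MemcachedClient client) {
        this.client = client;
    }

    // Inner key generator: builds keys of the form DB_TABLENAME_PrimaryKey.
    static class CacheKeyGen {
        static String key(String database, String tableName, Object primaryKey) {
            return database + "_" + tableName + "_" + primaryKey;
        }
    }

    public void setCache(String db, String table, Object pk, Object rowData) {
        client.set(CacheKeyGen.key(db, table, pk), 3600, rowData); // 1 hour expiry
    }

    public Object getCache(String db, String table, Object pk) {
        return client.get(CacheKeyGen.key(db, table, pk));
    }

    public void evict(String db, String table, Object pk) {
        client.delete(CacheKeyGen.key(db, table, pk));
    }
}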
This should solve the key generation problem. Let me know if it works for you.
Hi, I am new to Redis and would like some help here. I am using Java, SQL Server 2008, and a Redis server. To interact with Redis I am using the Jedis API for Java. I know that Redis stores key-value data: every key has a value.
Problem Background:
I have a table named "user" which stores id, name, email, age, and country. That is the schema of the SQL table, and the table already contains some rows. The primary key is id; it is only for DB use and has no meaning in my application.
In SQL I can insert a new row, update a row, search for any user, and delete a user.
I want to store this table's data in Redis and then perform similar operations on Redis as well: search, insert, delete. With a good design for storing this info in both the DB and Redis, those operations become simple to carry out. Remember that I can have multiple tables as well, so I should probably organize the data in Redis per table.
My Problem
Can you advise a design, or any information on how I can map DB data to Redis and perform all these operations? I am asking because I know Facebook also uses Redis to store data, so how do they store it?
Any help would be much appreciated.
This is a very hard question to answer, as there are multiple ways you could do it.
The best way, in my opinion, would be to use hashes. A hash is basically a nested key-value type: your key maps to a hash, in which you can store username, password, etc.
One problem is indexing: you need an ID embedded in the key. For example, each user would have a key like USER:21414.
Second, unless you want to rely on commands like KEYS or SCAN, you will have to maintain your own list of users to iterate over, if you need to do that at all. For this, look at lists or sorted sets.
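As an untested sketch, translating the HSET/INCR/SADD commands into what I believe are the matching Jedis calls (all key names are made up):

import java.util.Map;
import redis.clients.jedis.Jedis;

public class UserStore {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public long insertUser(String name, String email, String age, String country) {
        long id = jedis.incr("user:next_id"); // atomic counter replaces auto-increment
        String key = "USER:" + id;
        jedis.hset(key, "name", name);
        jedis.hset(key, "email", email);
        jedis.hset(key, "age", age);
        jedis.hset(key, "country", country);
        jedis.sadd("users", String.valueOf(id)); // our own index, instead of KEYS/SCAN
        return id;
    }

    public Map<String, String> getUser(long id) {
        return jedis.hgetAll("USER:" + id);
    }

    public void deleteUser(long id) {
        jedis.del("USER:" + id);
        jedis.srem("users", String.valueOf(id));
    }
}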
To be honest, there is no single true answer to this question; SQL-style data does not map to key-value stores in any direct way. You usually have to do a lot more of the work yourself.
I would suggest reading as much as you can, starting here: http://redis.io/commands and http://redis.io/documentation.
I have no experience using Jedis, so I can't help much on that side. If you want an example, I have an open-source social networking site which uses Redis as its sole data store. You can take a look at the code to get some ideas: https://github.com/pjuu/pjuu/blob/master/pjuu/auth/backend.py. It uses Python, but Redis is so easy to use that there will not be much difference anywhere.
Edit: my site above no longer solely uses Redis. Check out an older branch such as 0.4 or 0.3 :)
I'm working on an AppEngine project using JDO on top of the AppEngine datastore for persistence. I have an entity that uses an encoded string as its key and an application-generated key name (also a string). I did this because my app frequently scoops data from the wild (potentially scooping the same thing twice) and attempts to persist it. To avoid persisting several entities that essentially contain the same data, I decided to hash certain properties of the data to get a consistent key name (I don't manipulate keys directly because of entity relationships).
The problem is that whenever I calculate my hash (key name) and attempt to store the entity, if it already exists in the datastore, the datastore (or JDO, or whoever the culprit is) silently overwrites the properties of the stored entity without raising any exception. This has serious effects on the app because it overwrites the entities' timeStamp field, which we use for ordering.
How best can I get around this?
You need to do get-before-set (check-and-set, or CAS).
CAS is a fundamental tenet of concurrency, and it's a necessary evil of parallel computing.
Gets are much cheaper than sets anyway, so it may actually save you money.
Instead of blindly writing to the datastore, retrieve first. If the entity doesn't exist, catch the exception and just put the entity. If it does exist, do a deep compare before you save: if nothing has changed, don't persist it (and save that cost); if it has changed, apply whatever merge strategy you please. One (slightly ugly) way to maintain dated revisions is to store the previous entity as a field in the updated entity (this may not scale to many revisions).
But in this case, you have to get before you set. If you don't expect many duplicates and want to be really chintzy, you can do an exists query first, i.e. a keys-only count query on the key you want to use (it costs about 7x less than a full get): if count() == 0 then put(), else getAndMaybePut().
The count-query syntax might look slow, but from my benchmarks it's the fastest (and cheapest) way to tell whether an entity exists:
public boolean exists(Key key) {
    // Keys-only query, scoped to the parent when the key has one.
    Query q;
    if (key.getParent() == null)
        q = new Query(key.getKind());
    else
        q = new Query(key.getKind(), key.getParent());
    q.setKeysOnly();
    // Match on the key itself via the reserved __key__ property.
    q.setFilter(new FilterPredicate(
            Entity.KEY_RESERVED_PROPERTY, FilterOperator.EQUAL, key));
    return 1 == DatastoreServiceFactory.getDatastoreService().prepare(q)
            .countEntities(FetchOptions.Builder.withLimit(1));
}
You must do a get() to see whether an entity with the same key exists before you put() the new entity. There is no way around this.
You can use memcache and a local in-memory cache to speed up your get() operation. This only helps if you are likely to read the same information multiple times; if not, the memcache query may actually slow down your process.
To ensure that two requests do not overwrite each other, you should use a transaction (not possible with a query as suggested by Ajax, unless you put all items in a single entity group, which may limit your updates to 1 per second).
In pseudo code:
1. Create the Key from the hashed data
2. Check the in-memory cache for the Key (use a ConcurrentHashSet of keys); return if found
3. Check the MemcacheService for the Key; return if found
4. Start a transaction
5. Get the entity from the datastore; return if found
6. Create the entity in the datastore
7. Commit the transaction; return if it fails due to a concurrent update
8. Put the Key in the caches (in-memory and memcache)
Step 7 will fail if another request (thread) has already written the same key at the same time.
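A hedged sketch of steps 4 to 7 with the low-level datastore API (the timeStamp property is from the question; everything else is illustrative):

import java.util.ConcurrentModificationException;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Transaction;

public class GetOrCreate {
    public boolean createIfAbsent(Key key, long timeStamp) {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Transaction txn = ds.beginTransaction();          // step 4
        try {
            ds.get(txn, key);                             // step 5: it already exists
            txn.rollback();
            return false;
        } catch (EntityNotFoundException notFound) {
            Entity entity = new Entity(key);              // step 6
            entity.setProperty("timeStamp", timeStamp);
            ds.put(txn, entity);
            try {
                txn.commit();                             // step 7
                return true;
            } catch (ConcurrentModificationException race) {
                return false; // another request wrote the same key first
            }
        }
    }
}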
What I suggest is that instead of saving the ID as a string, you either use a Long ID for your entity or use the Key datatype, which AppEngine generates automatically.
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable
public class Test {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Long ID;
    // getter and setter
}
This will return a unique value every time.
This is a use case in member enrollment via a web application/web service. We have a complex algorithm for checking whether a member is a duplicate, which looks at multiple tables such as phone and address. The algorithm varies with the member's country, so the restriction cannot be implemented with a primary key or unique constraint.
So we have the checks in Java code. But if there are two concurrent duplicate requests, the two Java threads each see that the member doesn't exist, and both insert the record, resulting in duplicates. How can I prevent such duplicate inserts?
I can prevent duplicate updates by using row-level locks or Hibernate's optimistic concurrency. I can think of table-level locks to prevent such inserts, but that limits application performance, as it also blocks updates. Another option would be to create a lock table with a record with id = 'memberInsert' and force all inserts via JDBC to obtain a row-level lock on that record, as sketched below.
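Roughly what I have in mind, in plain JDBC (the insert_locks table and its single row are assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;

public class MemberEnrollment {
    // Assumed setup: CREATE TABLE insert_locks (id VARCHAR(32) PRIMARY KEY);
    //                INSERT INTO insert_locks VALUES ('memberInsert');
    public void enrollMember(Connection c /*, member data... */) throws Exception {
        c.setAutoCommit(false);
        try (PreparedStatement lock = c.prepareStatement(
                "SELECT id FROM insert_locks WHERE id = 'memberInsert' FOR UPDATE")) {
            lock.executeQuery(); // blocks until a concurrent enrollment commits
            // ... run the country-specific duplicate checks here ...
            // ... insert the member only if no duplicate was found ...
            c.commit();          // releases the row lock
        } catch (Exception e) {
            c.rollback();
            throw e;
        }
    }
}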
Thanks
Suneel
If this check is going to live anywhere, I'd expect it to be in a write trigger, not in the Java code. Some other application, or some other area of this application, could otherwise do something badly.
Offloading this onto the database gives you two advantages: 1) it prevents the race condition you mention, and 2) it protects the integrity of the data by not allowing some errant application to modify records and put them in an illegal state.
Can't you hash the outcome of the algorithm, or the normalized fields it compares, and simply use that as a unique primary key?
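A minimal sketch of that idea, assuming the country-specific algorithm has already normalized the fields it compares:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class MemberFingerprint {
    // Store the result in a column with a UNIQUE constraint; the second of two
    // concurrent duplicate inserts then fails at the database level.
    public static String of(String phone, String address, String country) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(
                (phone + "|" + address + "|" + country).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}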
As long as the database is not aware of your requirements, it will not help you. In that case you probably have no other choice than table-level locking.