My application has quite a large number of tables in the DB. What is an efficient way of generating keys for Memcached? Whenever we update a table's data, we have to check whether there is any cached data related to that table and clear it. I also need to take care of join queries: if either of the tables involved in a cached join is modified, the cached data should be cleared too.
The key could be DB_TABLENAME_PrimaryKey, where PrimaryKey is the value of the table's primary key.
In a custom client class, say CustomAppCache, define an inner class, say CacheKeyGen, with the properties database, tableName and primaryKeyField. Memcached will then hold the table data as the value under the key DB_TABLENAME_PrimaryKey.
When calling setCache, store the table's data in memcached under that key.
When calling getCache, check the keys against the requisite pattern and perform the intended operation, for example deleting the matching entries from the cache and reloading them.
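A minimal sketch of the key generation part, assuming a hypothetical CustomAppCache wrapper (the class and field names just mirror the description above; they are not part of any real memcached client API):

// Hypothetical wrapper; CustomAppCache and CacheKeyGen are illustrative names only.
public class CustomAppCache {

    public static class CacheKeyGen {
        private final String database;
        private final String tableName;
        private final String primaryKeyValue;

        public CacheKeyGen(String database, String tableName, String primaryKeyValue) {
            this.database = database;
            this.tableName = tableName;
            this.primaryKeyValue = primaryKeyValue;
        }

        // Full key for a single row: DB_TABLENAME_PRIMARYKEY
        public String toKey() {
            return database + "_" + tableName + "_" + primaryKeyValue;
        }

        // Prefix shared by all rows of one table; useful when the table changes
        // and every related entry has to be invalidated.
        public String tablePrefix() {
            return database + "_" + tableName + "_";
        }
    }
}

Note that memcached itself cannot enumerate keys by pattern, so the invalidation step usually means tracking the keys per table in the client (or putting a per-table version number in the key) rather than pattern-matching inside memcached.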
This should solve the key generation problem. Let me know if it does.
I have to process an XML file, and for that I need to fetch around ~4k objects by their primary key from a single table. I am using EhCache. I have a few questions:
1) It is taking a lot of time when I query row by row based on Id and save each result in the cache. Can I query once at the start, save the whole table in EhCache, and then query it by primary key later in the processing?
2) I don't want to use the query cache, as I can't load 4k objects at a time and loop over them to find the correct object.
I am looking for an optimal solution, as right now my process takes around 2 hours (it involves other processing too).
Thank you for your kind help.
You can read the whole table and store it in a Map<primary-key, table-row> to reduce the overhead of the DB connection.
A TreeMap is probably a good choice; it keeps the entries ordered by key and makes lookups fast.
Ehcache is great for handling concurrency, but if you are reading the XML with a single process you don't even need it (just keep the Map in memory).
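A minimal sketch of that idea, assuming a plain JDBC read; the table and column names (my_table, id, col_a, col_b) are placeholders:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;
import java.util.TreeMap;

public class TableLoader {

    // Read the whole table once and index it by primary key so that the
    // ~4k lookups during XML processing never hit the database again.
    public static Map<Long, String[]> loadAll(Connection conn) throws SQLException {
        Map<Long, String[]> rowsById = new TreeMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT id, col_a, col_b FROM my_table")) {
            while (rs.next()) {
                rowsById.put(rs.getLong("id"),
                        new String[] { rs.getString("col_a"), rs.getString("col_b") });
            }
        }
        return rowsById; // later: rowsById.get(primaryKey), purely in memory
    }
}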
I'm using Hibernate Envers for my revision history.
This is my table setup:
CREATE TABLE EPIC (
    epicid SERIAL NOT NULL,
    accountid BIGINT NOT NULL,
    description TEXT NOT NULL UNIQUE,
    epicowner TEXT NOT NULL,
    PRIMARY KEY(epicid)
);

CREATE TABLE EPIC_AUD (
    epicid BIGINT NOT NULL,
    REV BIGINT NOT NULL,
    accountid BIGINT,
    description TEXT,
    epicowner TEXT,
    REVTYPE BIGINT,
    PRIMARY KEY(epicid, REV)
);
Currently, when I make changes, it only saves the composite primary key values and the revision type. Since I also want to log the user who deleted an entity, I want to save that value too. This is the code I'm using for deleting the entity:
@Override
public boolean deleteItem(Epic epicFromFrontend) {
    transactionBegin();
    Epic epicToRemove = getEntityManager().find(Epic.class, epicFromFrontend.getEpicid());
    epicToRemove.setAccountid(epicFromFrontend.getAccountid());
    getEntityManager().remove(epicToRemove);
    return transactionEnd();
}
Actually I have 2 questions:
How do I save the accountid too?
Or is it maybe smarter and better to save ALL data, so I have no empty fields in my EPIC_AUD table after a delete?
It is common practice to capture various additional pieces of audit-specific information during the insert, update, or delete of your domain entities.
A simple yet intrusive way is to store that state in the same structure as the entity, as suggested by Marcin H. While this approach may work, there are several problems with it.
Mixing Concerns
The problem here is that history-related information is now being stored right alongside the domain-specific data. Much like security, auditing is a cross-cutting concern and should be treated as such when it comes to data structures. Additionally, as multiple audited rows in your schema are manipulated, you often end up repeating the same user, timestamp, etc. across multiple tables, which leads to unnecessary schema and table bloat.
Unnecessary fields / operations for data removal
When you store fields of this caliber on the entity itself, it introduces an interesting set of requirements as part of the entity removal process. If you want Envers to track the removal user, then you either have to perform an entity update with that user prior to removal, or introduce an additional column to track whether a row is soft-deleted, as suggested by Marcin H. The latter approach means the table will grow indefinitely, even as data is logically deleted, which can have negative impacts on long-term query performance among other concerns. Ideally, if data is no longer relevant except for historical purposes and no FK relationships must be maintained, it's far better to remove the row from the non-audit table.
Rather than the above, I suggest the strategy I posted here, which describes how to leverage a custom RevisionEntity data structure with Envers, allowing you to track multiple columns of data pertinent to the current transaction.
This approach has the following added benefits:
No Envers (audit) specific code littered across your DAO methods. Your DAO methods continue to focus on the domain-specific operation only, as they should.
In situations where multiple entities are manipulated during a single transaction, you now capture the various audit attributes only once per transaction (aka once per revision). This means that if the user adds, removes, and updates various rows in one transaction, they'll all be tagged with the same audit attributes.
You can now easily track the person who performed the row deletion because the audit attributes are kept on the RevisionEntity, which is generated for the deletion as well. No special operations or fields are needed to handle this case. Furthermore, you can enable storing the entity snapshot at deletion and then have access to both (1) who deleted the row and (2) what the row looked like prior to the removal.
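As a rough sketch of that strategy: the @RevisionEntity, DefaultRevisionEntity and RevisionListener types below are the real Envers API, while the user lookup itself is a placeholder (and whether you import javax.persistence or jakarta.persistence depends on your Hibernate version):

import javax.persistence.Entity;

import org.hibernate.envers.DefaultRevisionEntity;
import org.hibernate.envers.RevisionEntity;
import org.hibernate.envers.RevisionListener;

// One row per revision (i.e. per audited transaction); the audit attributes
// below are therefore captured once, no matter how many entities changed.
@Entity
@RevisionEntity(UserRevisionListener.class)
public class UserRevisionEntity extends DefaultRevisionEntity {

    private String username; // who performed the inserts/updates/deletes in this revision

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }
}

// Usually a separate file: fills in the custom fields when Envers opens a new revision.
class UserRevisionListener implements RevisionListener {
    @Override
    public void newRevision(Object revisionEntity) {
        UserRevisionEntity revision = (UserRevisionEntity) revisionEntity;
        // How the current user is resolved depends on your stack
        // (security context, thread-local, request attribute, ...).
        revision.setUsername("current-user-placeholder");
    }
}

If memory serves, the snapshot-at-delete behaviour mentioned above is switched on with the org.hibernate.envers.store_data_at_delete configuration property.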
You can add a boolean attribute record_active to your table epic, and of course to table epic_aud as well.
When record_active is false it means the record has been "deleted".
And never remove any record physically - in fact that's good practice :)
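A minimal sketch of what that can look like on the JPA side; the @Where annotation is Hibernate-specific and optional, and the field and column names are only illustrative:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.annotations.Where;
import org.hibernate.envers.Audited;

@Entity
@Audited
@Where(clause = "record_active = true") // normal queries only see "live" rows
public class Epic {

    @Id
    @GeneratedValue
    private Long epicid;

    @Column(name = "record_active", nullable = false)
    private boolean recordActive = true;

    // ... accountid, description, epicowner as before ...

    // Soft delete: issue an UPDATE instead of a physical DELETE,
    // so Envers records the change together with its revision metadata.
    public void markDeleted() {
        this.recordActive = false;
    }
}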
We have a plan to cache a DB table on the application side (to avoid DB calls). Our cache is a key/value-pair implementation. If I use the primary key (column1) as the key and all other data as the value, how can we execute the queries below against the cache?
select * from table where column1=?
select * from table where column2=? and column3=?
select * from table where column4=? and column5=? and column6=?
One simple option is to build 3 caches, as below:
(column1) --> Data
(column2+column3) --> Data
(column4+column5+column6) --> Data
Any other better options?
Key points:
The table contains millions of records.
We are using a Java ConcurrentHashMap for the cache implementation.
Looks like you want an in-memory cache. Guava has cool caches--you would need a LoadingCache.
Here is the link to LoadingCache
Basically, for your problem, the idea would be to have three LoadingCaches. A LoadingCache has a load method that you implement: it tells the cache how to fetch the data for a given key in case of a cache miss. So the first time you access the loading cache for query1, there will be a cache miss; the loading cache will use the method you implemented (your classic DAO method) to get the data, put it in the cache, and return it to you. The next time you access it, it will be served from your in-memory Guava cache.
So if you have three methods
Data getData(Column1 column)
Data getData(Column2 column2, Column3 column3)
Data getData(Column4 column4, Column5 column5, Column6 column6)
your three LoadingCaches will call these methods from the load implementations you write. And that's it. I find it a very clean and simple way to get what you want.
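As a rough sketch of the first of the three caches, using Guava's CacheBuilder/CacheLoader API; MyDao, Column1 and Data are placeholders taken from the method signatures above:

import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class Column1Cache {

    private final LoadingCache<Column1, Data> byColumn1;

    public Column1Cache(final MyDao dao) {
        this.byColumn1 = CacheBuilder.newBuilder()
                .maximumSize(100_000)                   // bound memory; the table has millions of rows
                .expireAfterWrite(10, TimeUnit.MINUTES) // optional staleness limit
                .build(new CacheLoader<Column1, Data>() {
                    @Override
                    public Data load(Column1 key) {
                        // Called only on a cache miss: the classic DAO lookup.
                        return dao.getData(key);
                    }
                });
    }

    public Data get(Column1 key) {
        return byColumn1.getUnchecked(key); // served from memory after the first load
    }
}

For the two- and three-column queries, the cache key would be a small value object wrapping those columns (with proper equals/hashCode), and the load method would delegate to the corresponding getData overload.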
You mentioned that you have to cache millions of records. That's quite a big number. I do not recommend building your own caching framework, especially not one based on simplistic data structures such as HashMap.
I highly recommend Redis - check it out at http://redis.io. Companies such as Twitter, Stack Overflow, etc. are using Redis for their caches.
Here is the live demonstration of Redis - http://try.redis.io
I have tagged this problem with both Oracle and Java because both Oracle and Java solutions would be accepted for this problem.
I am new to Oracle security and have been presented with the problem below. I have done some research on the internet but have had no luck so far. At first I thought Oracle TDE might help with my problem, but here: Can Oracle TDE protect data from the DBA? it seems TDE doesn't protect data against the DBA, and that is an issue which cannot be tolerated.
Here is the problem:
I have a table containing millions of records. I have a Java application which queries this table using equality or range criteria against a column which is the table's primary key. The primary key column contains sensitive data and has therefore already been encrypted. As a result, queries using plain (i.e. decrypted) values from the application cannot use the primary key's unique index access path. I need to improve the queries' performance without any changes to the application code (the application config can be modified if necessary, but not the code). Any changes on the database side are acceptable as long as that column remains encrypted.
Oracle people, please: what solution(s) do you suggest for this problem? Can I create an index on the decrypted column values and somehow force Oracle to use it? Can I use partitioning, such as hash partitioning? What about views? Any solution at all?
Java people, please: I have a vague idea of creating a separate application in between (i.e. between the database and the application) which acts as a proxy: it receives the queries from the application, replaces the decrypted values with encrypted values, sends the query to the database, receives the response, and returns the results to the application. The proxy should behave like a database, so that the application can connect to it just by changing the connection string in the configuration file. Would this work? How?
Thanks for all your help in advance!
which queries this table using equality or range criteria against a column in the table which is the primary key column of the table
Finding a specific value is simple enough - you can store the data encrypted any way you like, even as a hash, and still retrieve a specific value using an index. But as per my comment elsewhere, you can't do range queries without either:
decrypting each and every row in the table
or
using an algorithm that can be cracked in a few seconds.
Using a linked list (or a related table) to define order instead of an algorithm with intrinsic ordering would force a brute force check on a much larger set of values - but it's nowhere near as secure as a properly encrypted value.
It doesn't matter if you use Oracle, Java, or pencil and paper. It might be possible using quantum computing - but if you can't afford to ensure the security of your application or pay for good advice from an expert cryptographer, then you certainly won't be able to afford that.
How can I create an index on decrypted column values and somehow force Oracle to utilize this index?
Maybe you could create a function-based index in which you index the decrypted value:
create index ix1 on tablename (decryptfunction(pk1));
I have more of a theoretical question:
When does data get inserted into a database? Is it after persist or after commit is called? I'm asking because I have a problem with unique keys (manually generated) - they get duplicated. I'm thinking this is due to multiple users inserting data simultaneously into the same table.
UPDATE 1:
I generate keys in my application. Key examples: '123456789123', '123456789124', '123456789125'...
The key field is of varchar type, because there are a lot of old keys (which I can't delete or change) like 'VP123456', 'VP15S3456'. Another problem is that after being inserted into one database, these keys have to be inserted into another database. And I don't know what DB sequences and atomic objects are.
UPDATE 2:
These keys are used in finance documents and not as database keys. So they must be unique, but they are not used anywhere in the code as object keys.
I would suggest you create a Singleton that takes care of generating your keys. Make sure you can only get a new id once the singleton has been initialized with the latest value from the database.
To safeguard against incomplete inserts into the two databases, I would suggest you try XA transactions. They give you all-or-nothing inserts and updates: if any operation on any of the databases fails, everything is rolled back. Of course there is a downside to XA transactions; they are quite slow, and not all databases and database drivers support them.
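A minimal sketch of such a singleton, assuming the latest numeric key can be read from the database at startup; the table and column names in the query are placeholders:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.atomic.AtomicLong;

// Process-wide key generator: initialized once from the database, then purely in memory.
public final class KeyGenerator {

    private static volatile KeyGenerator instance;

    private final AtomicLong counter;

    private KeyGenerator(long lastUsedKey) {
        this.counter = new AtomicLong(lastUsedKey);
    }

    // Call once at application startup, before handing out any keys.
    public static synchronized KeyGenerator init(Connection conn) throws SQLException {
        if (instance == null) {
            try (Statement st = conn.createStatement();
                 // Placeholder query: adapt to wherever your latest key is stored.
                 ResultSet rs = st.executeQuery("SELECT MAX(key_number) FROM document_keys")) {
                rs.next();
                instance = new KeyGenerator(rs.getLong(1));
            }
        }
        return instance;
    }

    public static KeyGenerator getInstance() {
        if (instance == null) {
            throw new IllegalStateException("KeyGenerator.init(...) has not been called yet");
        }
        return instance;
    }

    public String nextKey() {
        return String.valueOf(counter.incrementAndGet());
    }
}

Note that this only guarantees uniqueness within one application instance; if several instances (or the second database) generate keys independently, a database sequence as suggested in the other answer is the more robust option.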
How do you generate these keys? Have you tried using sequences in the DB, or atomic objects?
I'm asking because it is normal to populate a DB concurrently.
EDIT1:
You can write a method that returns new keys based on an atomic counter; that way you know that any time you request a new key you receive a unique one. This strategy may and will lead to some keys being discarded, but that is a small price to pay, unless it is a requirement that the keys in the database are sequential.
import java.util.concurrent.atomic.AtomicLong;

private AtomicLong counter; // initialized somewhere else, e.g. with the last key already in the DB

public String getKey() {
    return "VP" + counter.incrementAndGet();
}
And here's some help on DB sequences in Oracle, MySQL, etc.