Using memcache in Google App Engine - Java

I have created an app using GAE. I am expecting 100k requests daily. At present, for each request the app needs to look up 4 tables and 8 different columns before performing the needed task.
These 4 tables are my master tables, having 5k, 500, 200, and 30 records. The data is under 1 MB (the limit).
Now I want to put my master records in memcache for faster access and to reduce RPC calls. When any user updates a master table, I'll replace the memcache object.
I need the community's suggestions about this.
Is it OK to change the current design?
How can I put the data from the 4 master tables in memcache?
Here is how the application currently works:
Hundreds of users access the same application page.
They provide a unique identification token and 3 more parameters (let's say p1, p2, and p3).
My servlet receives the request.
The application fetches the user table by token and checks the enabled state.
The application fetches another table (say, department) and checks whether p1 exists. If it exists, it checks the enabled status.
If the above returns true, a service table is queried based on parameter p2 to check whether this service is enabled for this user, and the service EndDate is checked.
Based on the length of p3, another table is checked for availability.

You shouldn't be thinking in terms of inserting tables into memcache. Instead, use an 'optimistic cache' strategy: any time you need to perform an operation that you want to cache, first attempt to look it up in memcache; if that fails, fetch it from the datastore, then store it in memcache. Here's an example:
from google.appengine.api import memcache
from google.appengine.ext import db

def cached_get(key):
    # Check the cache first; fall back to the datastore on a miss.
    entity = memcache.get(str(key))
    if not entity:
        entity = db.get(key)
        memcache.set(str(key), entity)
    return entity
Note, though, that caching individual entities yields fairly low returns - the datastore is quick at single-entity fetches. Caching query results or rendered pages will give a much bigger improvement in speed.
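The snippet above is Python; since the question is about Java, a rough equivalent using the low-level datastore and memcache APIs might look like this (a sketch only; the class name is illustrative, and you would invalidate with cache.delete(key), or overwrite the entry, whenever a master record changes):

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class CachedDatastore {
    private static final MemcacheService cache =
            MemcacheServiceFactory.getMemcacheService();
    private static final DatastoreService datastore =
            DatastoreServiceFactory.getDatastoreService();

    // Check memcache first; on a miss, fall back to the datastore and cache the result.
    public static Entity cachedGet(Key key) throws EntityNotFoundException {
        Entity entity = (Entity) cache.get(key);  // Key and Entity are both Serializable
        if (entity == null) {
            entity = datastore.get(key);          // datastore RPC happens only on a cache miss
            cache.put(key, entity);
        }
        return entity;
    }
}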

Related

Complexity of a query in the Google datastore

I have an Android app where users will be able to send private messages to each other. (For instance: A sends a message to B and C, and the three of them may comment on that message.)
I use Google App Engine and the Google datastore with Java (framework: Objectify).
I have created a Member entity and a Message entity which contains an ArrayList<String> field representing the list of recipients' IDs (that is to say, the key field of the Member entity).
In order for a user to get all the messages where he is one of the recipients, I was planning on loading each Message entity from the datastore and then selecting them by checking whether the ArrayList<String> field contains the user's ID. However, considering there may be hundreds of thousands of messages stored, I was wondering if that is even possible and if it wouldn't take too much time?
The time it takes to fetch results from the datastore relates only to the number of entities retrieved, not to the total number of entities stored, because every query MUST use an index. That's exactly what makes the datastore so scalable.
You will have to limit the number of messages retrieved per call and use a Cursor to fetch the next batch. You can send the cursor over to the Android client by converting it to a websafe string, so the client can indicate the starting point for the next request.
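For illustration, a sketch of that pattern with Objectify (v4-style API; the recipientIds field name and the page size are assumptions based on the question):

import java.util.ArrayList;
import java.util.List;
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;
import static com.googlecode.objectify.ObjectifyService.ofy;

public class MessagePage {
    public final List<Message> messages = new ArrayList<Message>();
    public String nextCursor;  // web-safe string the Android client echoes back for the next page

    public static MessagePage fetch(String userId, String webSafeCursor) {
        Query<Message> query = ofy().load().type(Message.class)
                .filter("recipientIds", userId)  // a list property matches if ANY element equals userId
                .limit(20);
        if (webSafeCursor != null) {
            query = query.startAt(Cursor.fromWebSafeString(webSafeCursor));
        }
        QueryResultIterator<Message> it = query.iterator();
        MessagePage page = new MessagePage();
        while (it.hasNext()) {
            page.messages.add(it.next());
        }
        page.nextCursor = it.getCursor().toWebSafeString();
        return page;
    }
}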

Designing count-based access control

I would like to get some advice on designing count-based access control. For example, I want to restrict the number of users that a customer can create in my system based on their account. So by default a customer can create 2 users, but if they upgrade their account they get to create 5 users, and so on.
There are a few more features that I need to restrict on a similar basis.
The application follows a generic model so every feature exposed has a backing table and we have a class which handles the CRUD operation on that table. Also the application runs on multiple nodes and has a distributed cache.
The approach that I am taking to implement this is as follows:
- I have a new table which captures the feature to be controlled and the allowed limit (stored per customer).
- I intercept the create method for all tables and check whether the table in question needs access control applied. If so, I fetch the count of created entities and compare it against the limit to decide whether to allow the creation.
- I use the database to handle synchronization in case of concurrent requests. After the create method is called, I update the count table with the following WHERE clause:
  where ( count_column + 1 ) = #countInMemory#
  That is, the update succeeds only if the value stored in the DB plus 1 equals the value in memory. This ensures that even if two threads attempt a create at the same time, only one of them will successfully update; the winner commits and the other one is rolled back. This way I do not need to synchronize any code in the application.
I would like to know if there is any other / better way of doing this. My application runs on Oracle and MySQL DB.
Thanks for the help.
When you roll back, do you retry (after fetching the new user count) or do you fail? I recommend the former, assuming that the newly fetched user count would permit another user.
I've dealt with a similar system recently, and a few things to consider:
- Do you want CustomerA to be able to transfer their users to CustomerB? (This assumes that customers are not independent; for example, in our system CustomerA might be an IT manager and CustomerB might be an accounting manager working for the same company, and when one of CustomerA's employees moves to accounting he wants this to be reflected in CustomerB's account.)
- What happens to a customer's users when the customer is deleted? (In our case another customer/manager would need to adopt them, or else they would be deleted.)
- How are you storing the customer's user limit: in a separate table (e.g. a customer has type "Level2," and the customer-type table says that "Level2" customers can create 5 users), in the customer's row (which is more error prone, but would also allow a per-customer override on their max user count), or a combination (a customer has a type column that says they can have 5 users, and an override column that says they can have an additional 3 users)?
But that's beside the point. Your DB synchronization is fine.
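For concreteness, a minimal JDBC sketch of the guarded update described in the question (the customer_limits table and its columns are illustrative names, not the asker's actual schema):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLimitGuard {
    // Returns true if a user slot was reserved; false if the limit was reached
    // or a concurrent request won the race (the caller can re-read and retry).
    public static boolean tryReserveUserSlot(Connection conn, long customerId)
            throws SQLException {
        conn.setAutoCommit(false);
        int countInMemory;
        try (PreparedStatement read = conn.prepareStatement(
                "SELECT user_count, max_users FROM customer_limits WHERE customer_id = ?")) {
            read.setLong(1, customerId);
            try (ResultSet rs = read.executeQuery()) {
                if (!rs.next() || rs.getInt("user_count") >= rs.getInt("max_users")) {
                    conn.rollback();
                    return false;  // unknown customer, or limit already reached
                }
                countInMemory = rs.getInt("user_count");
            }
        }
        // Optimistic update: succeeds only if nobody else incremented in the meantime.
        try (PreparedStatement update = conn.prepareStatement(
                "UPDATE customer_limits SET user_count = user_count + 1 "
              + "WHERE customer_id = ? AND user_count = ?")) {
            update.setLong(1, customerId);
            update.setInt(2, countInMemory);
            if (update.executeUpdate() == 1) {
                conn.commit();  // we won the race
                return true;
            }
            conn.rollback();    // lost the race
            return false;
        }
    }
}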

Concurrent update of a MySQL row using a trigger

I have a SOAP-based web service built with Java + MySQL.
The web service saves generated documents and sends them back as the response. Each user has a limited number of documents available. This service provides documents to external systems, so I have to know the number of documents available at any time for a specific user.
To handle this I built a trigger that updates the user row when a new document is created.
CREATE TRIGGER `Service`.`discount_doc_fromplan`
AFTER INSERT ON `Service`.`Doc` FOR EACH ROW
UPDATE `Service`.`User` SET User.DocAvailable = User.DocAvailable - 1 where User.id = NEW.idUser
The problem comes when a user tries to create 2 or more documents at the same time from their systems. This gives me a "Deadlock found when trying to get lock" error.
Does anybody have an idea how to improve this, avoiding the deadlock while still keeping the number of available documents correct? This is my first web service. Thanks.
You are trying to implement your business logic inside a database trigger. Instead of a trigger, you can implement this logic in either (1) your web service application middleware or (2) a stored procedure. I prefer approach (1), though. The basic code in either will collect all of a user's inserts into the Doc table in a cumulative counter and, at the end of all the inserts, update the User table with DocAvailable = DocAvailable - counter in one go. You can do this in a transaction so that you can roll back in case of a problem. You will have to read the user's available document quota before starting the transaction.
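A minimal sketch of approach (1) in JDBC, using the Doc and User tables from the question (the content column and the quota guard in the WHERE clause are assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class DocService {
    public static void saveDocuments(Connection conn, long userId, List<String> docs)
            throws SQLException {
        conn.setAutoCommit(false);
        try {
            try (PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO Doc (idUser, content) VALUES (?, ?)")) {
                for (String doc : docs) {
                    insert.setLong(1, userId);
                    insert.setString(2, doc);
                    insert.addBatch();
                }
                insert.executeBatch();
            }
            // One cumulative UPDATE per request instead of one per inserted row,
            // which avoids the repeated row-lock contention the trigger causes.
            try (PreparedStatement update = conn.prepareStatement(
                    "UPDATE User SET DocAvailable = DocAvailable - ? "
                  + "WHERE id = ? AND DocAvailable >= ?")) {
                update.setInt(1, docs.size());
                update.setLong(2, userId);
                update.setInt(3, docs.size());
                if (update.executeUpdate() != 1) {
                    throw new SQLException("Not enough documents available for user " + userId);
                }
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}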

Can anyone explain how to understand datastore read costs in App Engine?

I am running a geoquery among 300 user entities with a result range of 10.
I've run the query 120 times. For each query I got 10 user entity objects.
After this, my App Engine read operations reached 52% (26,000 operations).
My user entity has 12 single-value properties and 3 multi-value (List type) properties.
The user entity has 2 indexes on single-value properties and 2 indexes on List-type properties.
Can anyone please help me understand how Google App Engine counts datastore read operations?
As a start, use appstats. It'll show you where your costs are coming from in your app:
https://developers.google.com/appengine/docs/java/tools/appstats
To keep your application fast, you need to know:
Is your application making unnecessary RPC calls? Should it cache data instead of making repeated RPC calls to get the same data? Will your application perform better if multiple requests are executed in parallel rather than serially? The Appstats library helps you answer these questions and verify that your application is using RPC calls in the most efficient way by allowing you to profile your RPC calls. Appstats allows you to trace all RPC calls for a given request and reports on the time and cost of each call.
Once you understand where your costs are coming from, you can optimise.
If you just want to know what the prices are, they are here:
https://developers.google.com/appengine/docs/billing

Google App Engine JDO Query using Alternate logic for 'NOT IN'

I'm developing a Google App Engine Java app where users can search business objects from the database based on search criteria.
The search results (a list of records) should not include any of the records (a certain number of records, say 100) from their past searches. I'm storing the past results in the user profile for this reason.
Any suggestions on efficiently implementing this logic (without multiple iterations over the collection)? I'm using JDO, and there are restrictions on using a 'NOT IN' condition in queries.
Here's a solution, assuming your goal is to get 200 keys that are not in the history already.
I will attempt to estimate the number of operations used as a proxy for "efficiency", since this is how we will be charged in the new pricing model.
1. Fetch the User object and its "history keys" (1 read operation).
2. Do a keys-only query and fetch 300 records (300 small operations).
3. In your code, subtract any of the history keys from the 300 records (0 operations).
4. If you end up with fewer than 200 records after step 3, fetch another 100, repeating if necessary (100 small operations each time).
5. Once you have 200 keys not seen before, you can fetch the full business object entities if you need them, or display the keys to the user (200 read operations if you fetch the entire objects).
If the datastore supported a native "NOT IN" operator, then we could shave off 100 small operations from step 2, and skip step 4. The largest cost here will be fetching the actual 200 entities, which would have to happen with or without the NOT IN operator. Ultimately, this method is not that inefficient compared to what a native NOT IN operator would do.
Further optimizations:
If you don't need to display 200 keys all at once, then you can use cursors to only get N results at a time.
I am simply guessing when I suggest that you fetch 300 keys at first. You may need to fetch more or fewer. You can also probably fetch fewer than 100 on the second attempt.
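Putting the steps together, a sketch using the low-level datastore Java API (the BusinessObject kind is illustrative, and the history is assumed to be available as a Set of keys loaded from the user profile):

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.QueryResultList;

public class NotInSearch {
    public static List<Key> findUnseenKeys(DatastoreService ds, Set<Key> historyKeys) {
        PreparedQuery pq = ds.prepare(new Query("BusinessObject").setKeysOnly());
        List<Key> unseen = new ArrayList<Key>();
        Cursor cursor = null;
        int batchSize = 300;  // first batch; later batches of 100
        while (unseen.size() < 200) {
            FetchOptions opts = FetchOptions.Builder.withLimit(batchSize);
            if (cursor != null) {
                opts.startCursor(cursor);
            }
            QueryResultList<Entity> batch = pq.asQueryResultList(opts);
            if (batch.isEmpty()) {
                break;  // ran out of candidates
            }
            for (Entity e : batch) {
                if (!historyKeys.contains(e.getKey())) {  // the "NOT IN" filter, done in memory
                    unseen.add(e.getKey());
                    if (unseen.size() == 200) {
                        break;
                    }
                }
            }
            cursor = batch.getCursor();
            batchSize = 100;
        }
        return unseen;
    }
}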
