Caching with Google App Engine - java

When using the App Engine Datastore for storing entities, what is the usual technique for caching?
I mean, without caching we simply do something like this:
DatastoreService _ds = DatastoreServiceFactory.getDatastoreService();

public void put(String key, String value) {
    try {
        Entity e = new Entity(createKey(key));
        e.setProperty("key", key);
        e.setProperty("value", value);
        _ds.put(e);
    } catch (Exception ex) { // renamed from e so it doesn't shadow the Entity variable
        // handle exception
    }
}
So where does caching kick in? Also, how does caching come into play during get methods?
Update:
Simply put, my question is when to do caching. My basic implementation does no caching at all, just plain put and get against the Datastore.
Should caching be implemented at the lowest-level API in my code or in a higher-level API? In my case, the lowest-level API I have is this put and get against the Datastore.

There are really two kinds of caching to think about in App Engine: memcache for Datastore entities and edge caching for static assets. This video from Google covers both nicely, with specific code examples:
Google I/O 2012 - Optimizing Your Google App Engine App
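To make the memcache side concrete, here is a minimal write-through sketch combining the low-level memcache API with the Datastore code from the question; the entity kind, key scheme and error handling are assumptions of mine, not something prescribed by the talk:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class CachedStore {
    private final DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
    private final MemcacheService cache = MemcacheServiceFactory.getMemcacheService();

    public void put(String key, String value) {
        Entity e = new Entity("Item", key);   // kind and key scheme are illustrative
        e.setProperty("value", value);
        ds.put(e);                            // write to the Datastore first
        cache.put(key, value);                // then update memcache (write-through)
    }

    public String get(String key) {
        String cached = (String) cache.get(key);       // 1. try memcache
        if (cached != null) {
            return cached;
        }
        try {
            Key k = KeyFactory.createKey("Item", key); // 2. fall back to the Datastore
            Entity e = ds.get(k);
            String value = (String) e.getProperty("value");
            cache.put(key, value);                     // 3. repopulate the cache
            return value;
        } catch (EntityNotFoundException ex) {
            return null;
        }
    }
}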
For edge caching you can also check out this post by Brandon Wirtz, as the documentation is a bit thin: enabling edge caching

Related

Highlighting in Hibernate Search 6 and Elasticsearch backend

We're in the process of converting our Java application from Hibernate Search 5 to 6 with an Elasticsearch backend.
For some good background, see How to do highlighting within HibernateSearch over Elasticsearch for a question we had when upgrading our highlighting code from a Lucene to an Elasticsearch backend and how it was resolved.
Hibernate Search 6 seems to support using two backends at the same time, Lucene and Elasticsearch, so we'd like to use Elasticsearch for all our queries and Lucene for the highlighting, if that's possible.
Here is basically what we're trying to do:
public boolean matchPhoneNumbers() {
    String phoneNumber1 = "603-436-1234";
    String phoneNumber2 = "603-436-1234";

    LuceneBackend luceneBackend =
            Search.mapping(entityManager.getEntityManagerFactory())
                  .backend().unwrap(LuceneBackend.class);
    Analyzer analyzer = luceneBackend.analyzer("phoneNumberKeywordAnalyzer").get();

    //... builds a Lucene Query using the analyzer and phoneNumber1 term
    Query phoneNumberQuery = buildQuery(analyzer, phoneNumber1, ...);
    return isMatch("phoneNumberField", phoneNumber2, phoneNumberQuery, analyzer);
}

private boolean isMatch(String field, String target, Query sourceQ, Analyzer analyzer) {
    Highlighter highlighter = new Highlighter(new QueryScorer(sourceQ, field));
    highlighter.setTextFragmenter(new NullFragmenter());
    try {
        String result = highlighter.getBestFragment(analyzer, field, target);
        return StringUtils.hasText(result);
    } catch (IOException e) {
        ...
    }
}
What I've attempted so far is to configure two separate backends in the configuration properties, per the documentation, like this:
properties.setProperty("hibernate.search.backends.elasticsearch.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backends.elasticsearch.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backends.elasticsearch.uris", "http://127.0.0.1:9200");
The AnalysisConfigurer class implements ElasticsearchAnalysisConfigurer and
CustomLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer.
Analyzers are defined twice, once in the Elasticsearch configurer and again in the Lucene configurer.
I don't know why both hibernate.search.backends.elasticsearch.type and hibernate.search.backends.lucene.type are necessary but if I don't include the lucene.type, I get Ambiguous backend type: configuration property 'hibernate.search.backends.lucene.type' is not set.
But if I do have both backend properties types set, I get
HSEARCH000575: No default backend. Check that at least one entity is configured to target the default backend, when attempting to retrieve the Lucene backend, like:
Search.mapping(entityManager.getEntityManagerFactory())
.backend().unwrap(LuceneBackend.class);
And the same error when trying to retrieve the Elasticsearch backend.
I've also added @Indexed(..., backend = "elasticsearch") to my entities since I wish to have them saved into Elasticsearch and don't need them in Lucene. I also tried adding a fake entity with @Indexed(..., backend = "lucene") but it made no difference.
What have I got configured wrong?
I don't know why both hibernate.search.backends.elasticsearch.type and hibernate.search.backends.lucene.type are necessary but if I don't include the lucene.type, I get Ambiguous backend type: configuration property 'hibernate.search.backends.lucene.type' is not set.
That's because the backend name is just that: a name. Hibernate Search doesn't infer particular information from it, even if you name your backend "lucene" or "elasticsearch". You could have multiple Elasticsearch backends for all it knows :)
But if I do have both backend properties types set, I get HSEARCH000575: No default backend. Check that at least one entity is configured to target the default backend, when attempting to retrieve the Lucene backend, like:
Search.mapping(entityManager.getEntityManagerFactory())
.backend().unwrap(LuceneBackend.class);
You called .backend(), which retrieves the default backend, i.e. the backend that doesn't have a name and is configured through hibernate.search.backend.* instead of hibernate.search.backends.<somename>.* (see https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#configuration-structure ).
But you are apparently mapping all your entities to named backends, one named elasticsearch and one named lucene. So the default backend just doesn't exist.
You should call this:
Search.mapping(entityManager.getEntityManagerFactory())
.backend("lucene").unwrap(LuceneBackend.class);
I've also added @Indexed(..., backend = "elasticsearch") to my entities since I wish to have them saved into Elasticsearch
Since you obviously only want to use one backend for indexing, I would recommend reverting that change (keeping @Indexed without setting @Indexed.backend) and simply using the default backend.
In short, remove @Indexed.backend and replace this:
properties.setProperty("hibernate.search.backends.elasticsearch.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backends.elasticsearch.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backends.elasticsearch.uris", "http://127.0.0.1:9200");
With this
properties.setProperty("hibernate.search.backend.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backend.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backend.uris", "http://127.0.0.1:9200");
You don't technically have to do that, but I think it will be simpler in the long term. It keeps the Lucene backend as a separate hack that doesn't affect your whole application.
I also tried adding a fake entity with @Indexed(..., backend = "lucene")
I confirm you will need that fake entity mapped to the "lucene" backend, otherwise Hibernate Search will not create the "lucene" backend.
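For reference, such a placeholder could look something like the following; the class name and the javax vs. jakarta persistence imports are assumptions on my part:

import javax.persistence.Entity;   // or jakarta.persistence, depending on your ORM version
import javax.persistence.Id;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;

// Hypothetical placeholder whose only purpose is to make Hibernate Search
// create the "lucene" backend; it never needs to be persisted or queried.
@Entity
@Indexed(backend = "lucene")
public class LuceneBackendPlaceholder {

    @Id
    private Long id;
}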

implementing cache on filter apis spring boot

I am working on a Spring Boot app where I have multiple fetch APIs which are basically filter APIs taking in params and returning a response from the db.
Now under load they are acting pretty slow. Is there any way I can speed these up with a cache?
Can filter API results be cached, given that they may have different filters every time?
Currently I did this:
@Cacheable(value = "sku-info-cache", unless = "#result == null")
public SkuGroupPagedResponseMap fetchSkuGroupsByDatesAndWarehouseId(Integer warehouseId,
                                                                    Integer pageNumber,
                                                                    Integer pageSize,
                                                                    String startDate,
                                                                    String endDate) {
    log.info("fetching from db");
    SkuGroupPagedResponseMap skuGroupPagedResponseMap =
            locationInventoryClientService.fetchSkuGroupsByDatesAndWarehouseId(
                    warehouseId, pageNumber, pageSize, startDate, endDate);
    updateLotDetailsInSkuGroup(skuGroupPagedResponseMap);
    return skuGroupPagedResponseMap;
}
The best way to handle this particular scenario is to use a smart key. In your case, you can build a composite key from the requested filter parameters (with five parameters that gives up to 5! key combinations), which can then be invalidated at database-update time using a prefix-deletion cache-update strategy. I have tried this and found it to be very fast.
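One way to sketch that idea with Spring's cache abstraction is to give @Cacheable an explicit SpEL key built from the filter parameters, so the warehouseId appears as a prefix that a prefix-deletion eviction strategy can target; the separator and key layout below are assumptions, and how you actually evict by prefix depends on your cache provider (e.g. Redis key scans), which is not shown. Note that Spring's default key generator already combines all parameters.

@Cacheable(value = "sku-info-cache",
        // Composite key: warehouseId first so evictions can target a warehouse prefix.
        key = "#warehouseId + ':' + #pageNumber + ':' + #pageSize + ':' + #startDate + ':' + #endDate",
        unless = "#result == null")
public SkuGroupPagedResponseMap fetchSkuGroupsByDatesAndWarehouseId(Integer warehouseId,
                                                                    Integer pageNumber,
                                                                    Integer pageSize,
                                                                    String startDate,
                                                                    String endDate) {
    log.info("fetching from db");
    SkuGroupPagedResponseMap skuGroupPagedResponseMap =
            locationInventoryClientService.fetchSkuGroupsByDatesAndWarehouseId(
                    warehouseId, pageNumber, pageSize, startDate, endDate);
    updateLotDetailsInSkuGroup(skuGroupPagedResponseMap);
    return skuGroupPagedResponseMap;
}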

How to upload form data to google app engine

To upload data to the Datastore I use this Java code:
DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
Entity entity = new Entity("mydetail");
entity.setProperty("entry", "entry");
ds.put(entity);
For uploading form-based data, is this the correct method, i.e. using code similar to the above, or is there another API I should be using?
Yes, this is the direct API to the App Engine Datastore.
You can also use JDO interface which allows for directly storing a Java object without dealing with the Datastore API:
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;

@PersistenceCapable
public class MyDetail {
    // ...
    @Persistent
    private String entry;
    // ...
}
There is also the JPA interface. Both of the interfaces are described on the App Engine website.
The Objectify interface is, for many situations, even easier, though it is not part of the official SDK.
You can use whichever makes more sense for your application.
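For form-based data specifically, the usual pattern is an HttpServlet (or a JSP/framework controller) whose doPost reads the submitted parameters and stores them with whichever persistence API you picked. A minimal sketch with the low-level Datastore API; the servlet class name, form field name and redirect target are illustrative, not from the question:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;

public class MyDetailServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Read the submitted form field (field name "entry" is illustrative).
        String entry = req.getParameter("entry");

        // Store it exactly as in the snippet from the question.
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Entity entity = new Entity("mydetail");
        entity.setProperty("entry", entry);
        ds.put(entity);

        // Redirect after POST so a page refresh doesn't resubmit the form.
        resp.sendRedirect("/");
    }
}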

How to refresh JPA entities when backend database changes asynchronously?

I have a PostgreSQL 8.4 database with some tables and views which are essentially joins on some of the tables. I used NetBeans 7.2 (as described here) to create REST based services derived from those views and tables and deployed those to a Glassfish 3.1.2.2 server.
There is another process which asynchronously updates contents in some of the tables used to build the views. I can directly query the views and tables and see these changes have occurred correctly. However, when pulled from the REST based services, the values are not the same as those in the database. I am assuming this is because JPA has cached local copies of the database contents on the Glassfish server and JPA needs to refresh the associated entities.
I have tried adding a couple of methods to the AbstractFacade class NetBeans generates:
public abstract class AbstractFacade<T> {

    private Class<T> entityClass;
    private String entityName;
    private static boolean _refresh = true;

    public static void refresh() { _refresh = true; }

    public AbstractFacade(Class<T> entityClass) {
        this.entityClass = entityClass;
        this.entityName = entityClass.getSimpleName();
    }

    private void doRefresh() {
        if (_refresh) {
            EntityManager em = getEntityManager();
            em.flush();
            for (EntityType<?> entity : em.getMetamodel().getEntities()) {
                if (entity.getName().contains(entityName)) {
                    try {
                        em.refresh(entity);
                        // log success
                    } catch (IllegalArgumentException e) {
                        // log failure ... typically complains entity is not managed
                    }
                }
            }
            _refresh = false;
        }
    }
    ...
}
I then call doRefresh() from each of the find methods NetBeans generates. What normally happens is that an IllegalArgumentException is thrown, stating something like Can not refresh not managed object: EntityTypeImpl#28524907:MyView [ javaType: class org.my.rest.MyView descriptor: RelationalDescriptor(org.my.rest.MyView --> [DatabaseTable(my_view)]), mappings: 12].
So I'm looking for suggestions on how to correctly refresh the entities associated with the views so they are up to date.
UPDATE: Turns out my understanding of the underlying problem was not correct. It is somewhat related to another question I posted earlier, namely the view had no single field which could be used as a unique identifier. NetBeans required I select an ID field, so I just chose one part of what should have been a multi-part key. This exhibited the behavior that all records with a particular ID field were identical, even though the database had records with the same ID field but the rest of it was different. JPA didn't go any further than looking at what I told it was the unique identifier and simply pulled the first record it found.
I resolved this by adding a unique identifier field (never was able to get the multipart key to work properly).
I recommend adding an @Startup @Singleton class that establishes a JDBC connection to the PostgreSQL database and uses LISTEN and NOTIFY to handle cache invalidation.
Update: Here's another interesting approach, using pgq and a collection of workers for invalidation.
Invalidation signalling
Add a trigger on the table that's being updated that sends a NOTIFY whenever an entity is updated. On PostgreSQL 9.0 and above this NOTIFY can contain a payload, usually a row ID, so you don't have to invalidate your entire cache, just the entity that has changed. On older versions where a payload isn't supported you can either add the invalidated entries to a timestamped log table that your helper class queries when it gets a NOTIFY, or just invalidate the whole cache.
Your helper class now LISTENs on the NOTIFY events the trigger sends. When it gets a NOTIFY event, it can invalidate individual cache entries (see below), or flush the entire cache. You can listen for notifications from the database with PgJDBC's listen/notify support. You will need to unwrap any connection pooler managed java.sql.Connection to get to the underlying PostgreSQL implementation so you can cast it to org.postgresql.PGConnection and call getNotifications() on it.
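A minimal sketch of that listener loop, under the assumption of a channel named entity_changed whose payload is the changed row's ID (both names are made up here; reconnection and error handling are omitted):

import java.sql.Connection;
import java.sql.Statement;
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;

public class EntityChangeListener {

    public void listen(Connection conn) throws Exception {
        // Unwrap the (possibly pooled) connection to reach the PgJDBC implementation.
        PGConnection pgConn = conn.unwrap(PGConnection.class);

        try (Statement stmt = conn.createStatement()) {
            stmt.execute("LISTEN entity_changed");   // channel name is illustrative
        }

        while (true) {
            // Depending on the driver version you may need to issue a dummy query
            // (e.g. "SELECT 1") before pending notifications are read from the socket.
            PGNotification[] notifications = pgConn.getNotifications();
            if (notifications != null) {
                for (PGNotification n : notifications) {
                    String rowId = n.getParameter(); // NOTIFY payload (PostgreSQL 9.0+)
                    // e.g. emf.getCache().evict(MyView.class, Long.valueOf(rowId));
                }
            }
            Thread.sleep(500); // crude polling interval
        }
    }
}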
As an alternative to LISTEN and NOTIFY, you could poll a change log table on a timer, and have a trigger on the problem table append changed row IDs and change timestamps to the change log table. This approach will be portable except for the need for a different trigger for each DB type, but it's inefficient and less timely. It'll require frequent inefficient polling, and still have a time delay that the listen/notify approach does not. In PostgreSQL you can use an UNLOGGED table to reduce the costs of this approach a little bit.
Cache levels
EclipseLink/JPA has a couple of levels of caching.
The 1st level cache is at the EntityManager level. If an entity is attached to an EntityManager by persist(...), merge(...), find(...), etc, then the EntityManager is required to return the same instance of that entity when it is accessed again within the same session, whether or not your application still has references to it. This attached instance won't be up-to-date if your database contents have since changed.
The 2nd level cache, which is optional, is at the EntityManagerFactory level and is a more traditional cache. It isn't clear whether you have the 2nd level cache enabled. Check your EclipseLink logs and your persistence.xml. You can get access to the 2nd level cache with EntityManagerFactory.getCache(); see Cache.
@thedayofcondor showed how to flush the 2nd level cache with:
em.getEntityManagerFactory().getCache().evictAll();
but you can also evict individual objects with the evict(java.lang.Class cls, java.lang.Object primaryKey) call:
em.getEntityManagerFactory().getCache().evict(theClass, thePrimaryKey);
which you can use from your @Startup @Singleton NOTIFY listener to invalidate only those entries that have changed.
The 1st level cache isn't so easy, because it's part of your application logic. You'll want to learn about how the EntityManager, attached and detached entities, etc work. One option is to always use detached entities for the table in question, where you use a new EntityManager whenever you fetch the entity. This question:
Invalidating JPA EntityManager session
has a useful discussion of handling invalidation of the entity manager's cache. However, it's unlikely that an EntityManager cache is your problem, because a RESTful web service is usually implemented using short EntityManager sessions. This is only likely to be an issue if you're using extended persistence contexts, or if you're creating and managing your own EntityManager sessions rather than using container-managed persistence.
You can disable caching entirely (see: http://wiki.eclipse.org/EclipseLink/FAQ/How_to_disable_the_shared_cache%3F ), but be prepared for a fairly large performance loss.
Otherwise, you can clear the cache programmatically with
em.getEntityManagerFactory().getCache().evictAll();
You can map it to a servlet so you can call it externally; this is better if your database is modified externally only very seldom and you just want to be sure JPA will pick up the new version.
Just a thought, but how do you receive your EntityManager/Session/whatever?
If you queried the entity in one session, it will be detached in the next one and you will have to merge it back into the persistence context to get it managed again.
Trying to work with detached entities may result in those not-managed exceptions; you should re-query the entity, or you could try merging it (or similar methods).
JPA doesn't do any caching by default. You have to explicitly configure it. I believe it's a side effect of the architectural style you have chosen: REST. I think caching is happening at the web servers, proxy servers etc. I suggest you read this and debug more.

Does Java EE security model support ACL?

I use Java EE 6 with GlassFish v3.0.1, and I wonder whether the Java EE security model supports ACLs, and if so, how fine-grained does it get?
EDITED
I implemented security using a JDBC realm in GlassFish v3, where the realm at runtime looks into the USER table in the database, checking the password field for authentication and the role field for authorization. The role field only contains two values, either ADMINISTRATOR or DESIGNER, so it is a one-to-one map between user and role. At the managed bean level, I implemented this:
private Principal getLoggedInUser() {
    HttpServletRequest request =
            (HttpServletRequest) FacesContext.getCurrentInstance()
                    .getExternalContext().getRequest();
    if (request.isUserInRole("ADMINISTRATORS")) {
        admin = true;
    } else {
        admin = false;
    }
    return request.getUserPrincipal();
}

public boolean isUserNotLogin() {
    Principal loginUser = getLoggedInUser();
    if (loginUser == null) {
        return true;
    }
    return false;
}

public String getLoginUserName() {
    Principal loginUser = getLoggedInUser();
    if (loginUser != null) {
        return loginUser.getName();
    }
    return "None";
}
By calling isUserInRole, I can determine whether the user is an admin or not, and the JSF page then renders the content appropriately. However, that is not fine-grained enough (quick background: there are multiple projects, and a project contains multiple drawings). If you are a DESIGNER, you can see all the drawings from all the projects, but what if I only want Tom to work on project A, while Peter works on project B, and Cindy supervises both projects A and B? I want to be able, at runtime when I create the user, to specify which projects he/she can see. Is there a way to accomplish this? NOTE: There are more than just two projects; the above example is just for demonstration.
The Java EE security model authenticates a 'Principal', which may have one or more 'Roles'.
In the other dimension you have services and resources which need configurable 'Permissions' or 'Capabilities'.
In the configuration you determine which 'Principals' or 'Roles' have which 'Permissions' or 'Capabilities'.
In other words, yes it supports ACL and it is as fine grained as you want it to be, but you'll have to get used to the terminology.
Vineet's answer contains the excellent suggestion to create 'roles' per project ID. Since people must be assigned to projects anyhow, it is straightforward to add the people to these groups at that time. Alternatively, a timed script can update the group memberships based on the roles. The latter approach can be preferable, because it is easier to verify security if these decisions are in one place instead of scattered all over the administration code.
Alternatively, you can use "coarse-grained" roles, e.g. DESIGNER, and make use of the database (or program logic) to restrict the views for the logged-in user:
SELECT p.* FROM projects p, assignments a WHERE p.id = a.projectId AND a.finishdate < NOW();
or
@Stateless
class SomeThing {

    @Resource
    SessionContext ctx;

    @RolesAllowed("DESIGNER")
    public void doSomething(Project project) {
        String userName = ctx.getCallerPrincipal().getName();
        if (project.getTeamMembers().contains(userName)) {
            // do stuff
        }
    }
}
Note that the coarse-grained access control is done here with an annotation instead of code. This can move a lot of hard-to-test boilerplate out of the code and save a lot of time.
There are similar features to render webpages where you can render parts of the screen based on the current user using a tag typically.
Also because security is such a wide reaching concern, I think it is better to use the provided features to get at the context than to pass a battery of boolean flags like isAdmin around as this quickly becomes very messy. It increases coupling and it is another thing making the classes harder to unit-test.
In many JSF implementations there are tags which can help with rendering things conditionally. Here are examples for RichFaces and Seam:
<!-- richfaces -->
<rich:panel header="Admin panel" rendered="#{rich:isUserInRole('admin')}">
Very sensitive information
</rich:panel>
<!-- seam -->
<h:commandButton value="edit" rendered="#{isUserInRole['admin']}"/>
Here is an article explaining how to add it to ADF
The Java EE security model implements RBAC (Role Based Access Control). To a Java EE programmer, this effectively means that permissions to access a resource can be granted to users. Resources could include files, databases, or even code. Therefore, it is possible to not only restrict access to objects like files and tables in databases, it is also possible to restrict access to executable code.
Now, permissions can be grouped together into roles that are eventually linked to users/subjects. This is the Java EE security model in a nutshell.
From the description of your problem, it appears that you wish to distinguish between two different projects as two different resources, and therefore have either two separate permission objects or two separate roles to account for them. Given that you already have roles (more appropriately termed user groups) like Administrator, Designer etc., this cannot be achieved quite so easily in Java EE. The reason is that you are distinguishing access to resources for users in a role based on an additional property of the resource, the project ID. This technically falls into the area known as ABAC (Attribute Based Access Control).
One way of achieving ABAC in Java EE is to carry the properties/attributes granted to the role, in the role name. So instead of the following code:
if(request.isUserInRole("DESIGNERS")){
access = true;
}else{
access = false;
}
you ought to do something like the following. Note the ":" character used as a separator to distinguish the role name from the accompanying attribute.
if(request.isUserInRole("DESIGNERS"+":"+projectId)){
access = true;
}else{
access = false;
}
Of course, there is the part where your login module should be modified (either in configuration or in code) to return roles containing project IDs instead of plain role names. Do note that all of these suggested changes need to be reviewed comprehensively for issues; for instance, one should disallow the separator character from being part of a role name, otherwise it is quite possible to perform privilege-escalation attacks.
If implementing the above proves to be a handful, you could look at systems like Shibboleth that provide support for ABAC, although I've never seen it being used in a Java EE application.
