Key-Value on top of Appengine


Although App Engine is already schema-less, you still need to define the entities that will be stored in the Datastore through the Datanucleus persistence layer. So I am thinking of a way around this: a layer that stores key-value pairs defined at runtime, instead of entities defined at compile time.
The way this is done with Redis is by creating a key like this:
private static final String USER_ID_FORMAT = "user:id:%s";
private static final String USER_NAME_FORMAT = "user:name:%s";
According to the docs, the Redis types are String, Linked-list, Set and Sorted set; I am not sure whether there are more.
As far as the GAE datastore is concerned, the entity to be stored would have to consist of a String "key" and a "value".
Like:
public class KeyValue {
    private String key;
    private Value value; // value can be a String, Linked-list, Set or Sorted set etc.
    // Code omitted
}
The justification for this scheme is rooted in RESTful access to the datastore (provided by Datanucleus-api-rest).
Using this REST API, to persist an object or entity:
POST http://datanucleus.appspot.com/dn/guestbook.Greeting
{"author":null,
"class":"guestbook.Greeting",
"content":"test insert",
"date":1239213923232}
The problem with this approach is that in order to persist an entity, the actual class needs to be defined at compile time. With a key-value store mechanism, by contrast, the call can be simplified:
POST http://datanucleus.appspot.com/dn/org.myframework.KeyValue
{"class":"org.myframework.KeyValue",
"key":"user:id:johnsmith;followers",
"value":"the_list"}
Passing a single string as "value" is fairly easy, and I can use a JSON array for a list, set or sorted set. The real question is how to actually persist the different types of data passed into the interface. Should there be multiple KeyValue entities, each representing one of the basic types it supports: KeyValueString? KeyValueList? etc.
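One possible shape for this, as a minimal sketch rather than a recommendation: keep a single KeyValue entity and add a type tag, so that no KeyValueString/KeyValueList classes are needed. Everything below is hypothetical (the Type enum, the in-memory KeyValueStore standing in for the datastore, and the "user:id:%s:followers" key format), and only Strings are supported as items.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical single entity: a key, a type tag, and the items of the value. */
final class KeyValue {
    enum Type { STRING, LIST, SORTED_SET }

    final String key;
    final Type type;
    final List<String> items; // a STRING value is stored as a one-element list

    private KeyValue(String key, Type type, List<String> items) {
        this.key = key;
        this.type = type;
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
    }

    static KeyValue ofString(String key, String value) {
        return new KeyValue(key, Type.STRING, Collections.singletonList(value));
    }

    static KeyValue ofList(String key, List<String> values) {
        return new KeyValue(key, Type.LIST, values);
    }

    static KeyValue ofSortedSet(String key, List<String> values) {
        // TreeSet drops duplicates and sorts in natural order.
        return new KeyValue(key, Type.SORTED_SET, new ArrayList<>(new TreeSet<>(values)));
    }
}

/** In-memory stand-in for the datastore, keyed by Redis-style strings. */
final class KeyValueStore {
    // Assumed key format, modelled on the USER_ID_FORMAT constant above.
    private static final String USER_FOLLOWERS_FORMAT = "user:id:%s:followers";
    private final Map<String, KeyValue> rows = new HashMap<>();

    static String followersKey(String userId) {
        return String.format(USER_FOLLOWERS_FORMAT, userId);
    }

    void put(KeyValue kv) {
        rows.put(kv.key, kv);
    }

    KeyValue get(String key) {
        return rows.get(key);
    }
}
```

A caller would write store.put(KeyValue.ofSortedSet(KeyValueStore.followersKey("johnsmith"), names)) and branch on kv.type when reading the value back.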

Looks like you're using a JSON based REST API, so why not just store Value as a JSON string?

You do not need to use the Datanucleus layer, or any of the other fine ORM layers (like Twig or Objectify). Those are optional, and are all based on the low-level API. If I interpret what you are saying properly, perhaps it already has the functionality that you want. See: https://developers.google.com/appengine/docs/java/datastore/entities

Datanucleus is a specific framework that runs on top of GAE. You can however access the database at a lower, less structured, more key/value-like level - the low-level API. That's the lowest level you can access directly.
BTW, the low-level "GAE datastore" internally runs on 6 global Google Megastore tables, which in turn are hosted on the Google Bigtable database system.
Saving JSON as a String works fine. But you will need ways to retrieve your objects other than by ID. That is, you need a way to index your data to support any kind of useful query on it.
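To illustrate the indexing point, here is a minimal in-memory sketch, not the datastore API: next to the opaque JSON string, each entry carries one value that is indexed so it can be found without knowing its key. The class and method names are hypothetical. On the real datastore the usual equivalent would presumably be an extra indexed property stored beside the JSON payload.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical store: JSON payloads by key, plus one secondary index. */
final class IndexedJsonStore {
    private final Map<String, String> jsonByKey = new HashMap<>();
    private final Map<String, String> indexedValueByKey = new HashMap<>();
    // indexed value -> keys of all entries carrying that value
    private final Map<String, Set<String>> keysByIndexedValue = new HashMap<>();

    void put(String key, String json, String indexedValue) {
        // Drop the stale index entry first if the key is being overwritten.
        String previous = indexedValueByKey.put(key, indexedValue);
        if (previous != null) {
            Set<String> keys = keysByIndexedValue.get(previous);
            keys.remove(key);
            if (keys.isEmpty()) {
                keysByIndexedValue.remove(previous);
            }
        }
        keysByIndexedValue.computeIfAbsent(indexedValue, v -> new TreeSet<>()).add(key);
        jsonByKey.put(key, json);
    }

    /** Lookup by ID, which works without any index. */
    String get(String key) {
        return jsonByKey.get(key);
    }

    /** Lookup by the indexed value, which is what the extra index buys. */
    Set<String> findKeys(String indexedValue) {
        Set<String> keys = keysByIndexedValue.get(indexedValue);
        return keys == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new TreeSet<>(keys));
    }
}
```

Anything that is only inside the JSON string stays invisible to queries, so each field you want to search on needs its own index entry like this.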

Related

Android Firestore limitations to custom object models

I am migrating my app to use Firebase Firestore, and one of my models is very complex (contains lists of other custom objects). Looking at the documentation, on how to commit a model object as a document, it looks like you simply create your model object with a public constructor, and getters and setters.
For example from the add data guide:
public class City {
    private String name;
    private String state;
    private String country;
    private boolean capital;
    private long population;
    private List<String> regions;

    public City() {}

    public City(String name, String state, String country, boolean capital, long population, List<String> regions) {
        // ...
    }

    // getters/setters
}
Firestore automatically translates this to and from a document without any additional steps. You pass an instance to a DocumentReference.set(city) call, and retrieve it with a call to DocumentSnapshot.toObject(City.class).
How exactly does it serialize this to a document? Through reflection? It doesn't discuss any limitations. Basically, I'm left wondering if this will work on more complex models, and how complex. Will it work for a class with an ArrayList of custom objects?

Firestore automatically translates this to and from a document without any additional steps. How exactly does it serialize this to a document? Through reflection?
You're guessing right: through reflection. As @Doug Stevenson also mentioned in his comment, it is very common for systems such as Firebase to convert JSON data to a POJO (Plain Old Java Object). Please also note that setters are not required. If there is no setter for a JSON property, the Firebase client will set the value directly onto the field. A constructor with arguments is also not required. While both are idiomatic, there are good cases for having classes without them. Please also take a look at some information regarding the existence of the no-argument constructor.
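To make "through reflection" concrete, here is a minimal sketch of how a getter-based POJO can be turned into a document-like map. It is not Firestore's actual implementation: PojoMapper, CityDoc and Region are hypothetical names, and only Strings, numbers, booleans, Lists and nested POJOs are handled.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical mapper: turns getter-based POJOs into nested maps and lists. */
final class PojoMapper {
    static Map<String, Object> toMap(Object bean) {
        Map<String, Object> out = new TreeMap<>();
        for (Method m : bean.getClass().getMethods()) {
            String property = propertyName(m);
            if (property == null) continue;
            try {
                out.put(property, convert(m.invoke(bean)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + property, e);
            }
        }
        return out;
    }

    /** Returns the property name for a getter, or null if m is not a getter. */
    private static String propertyName(Method m) {
        if (m.getParameterCount() != 0 || Modifier.isStatic(m.getModifiers())
                || m.getDeclaringClass() == Object.class) {
            return null;
        }
        String name = m.getName();
        String suffix;
        if (name.startsWith("get") && name.length() > 3) {
            suffix = name.substring(3);
        } else if (name.startsWith("is") && name.length() > 2
                && m.getReturnType() == boolean.class) {
            suffix = name.substring(2);
        } else {
            return null;
        }
        return Character.toLowerCase(suffix.charAt(0)) + suffix.substring(1);
    }

    private static Object convert(Object value) {
        if (value == null || value instanceof String
                || value instanceof Number || value instanceof Boolean) {
            return value;
        }
        if (value instanceof List) {
            List<Object> items = new ArrayList<>();
            for (Object item : (List<?>) value) items.add(convert(item));
            return items;
        }
        return toMap(value); // any other object is treated as a nested POJO
    }
}

/** Hypothetical nested custom object. */
class Region {
    private final String name;
    Region(String name) { this.name = name; }
    public String getName() { return name; }
}

/** Hypothetical model holding a list of custom objects. */
class CityDoc {
    private final String name;
    private final boolean capital;
    private final List<Region> regions;

    CityDoc(String name, boolean capital, List<Region> regions) {
        this.name = name;
        this.capital = capital;
        this.regions = regions;
    }

    public String getName() { return name; }
    public boolean isCapital() { return capital; }
    public List<Region> getRegions() { return regions; }
}
```

The list of Region objects comes out as a list of maps, which is the general idea behind nested custom objects ending up as nested map values inside a document.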
It doesn't discuss any limitations.
Yes it does. The official documentation explains that the documents have limits. So there are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB of data in total in a single document. When we are talking about storing text, you can store quite a lot, but as your array gets bigger (with custom objects), be careful about this limitation.
Please also note that if you are storing a large amount of data in arrays, and those arrays are updated by lots of users, there is another limitation to take care of: you are limited to 1 write per second on every document. So if a lot of users are all trying to write/update data in the same documents at once, you might start to see some of these writes fail. So be careful about this limitation too.
Will it work for a class with an ArrayList of custom objects?
It will work with any type of class as long as its fields are of supported data types.
Basically, I'm left wondering if this will work on more complex models, and how complex.
It will work with any kind of complex model as long as you are using the correct data types for your objects and your documents are within that 1 MiB limitation.

Firebase Java POJOs and local-only fields

Given the following POJO example, which stores local fields applicable only to the app running right here, not to anybody else who also uses the Firebase data:
@IgnoreExtraProperties
public class DispatchModel {
    private String isotimestamp;
    private String date_time;
    private String event_uuid;
    //...
    private String locallyCreatedValue;
    //...constructor, getters, setters
}
Given that my Firebase data has the first three fields stored in it, and the locallyCreatedValue is added to my POJO during app runtime, is there a way to automatically fetch all the locally added content from a POJO instance and apply it to the new instance when an update from an onChildChanged event happens?
As it is right now, I'll have to manually obtain all the local field values and set them on the new instance:
@Override
public void onChildChanged(DataSnapshot dataSnapshot, String s) {
    DispatchModel newModel = dataSnapshot.getValue(DispatchModel.class);
    // get list index of expiring instance
    // get instance of old item at list index
    // index = ...
    // old = ...
    // repeat this for every local item :-/
    newModel.setLocallyCreatedValue(old.getLocallyCreatedValue());
    dispatchList.set(index, newModel);
}
I plan on having quite a few local fields; is this my only option? Are there any functions Firebase offers that make its automatic object instantiation more friendly to my extras? I'm not keen on creating distinct POJOs to track the Firebase POJOs in parallel. That invites data errors from decoupled data updates and requires careful scheduling of execution.

If all of your locally created values are expressed as standard getters and setters in your POJO, you are either going to have to copy them manually, or write some fairly intense Java reflection code to inspect the class, somehow figure out which properties are local (annotation? inclusion/exclusion from a known list?) and should be copied over, then actually do that work. You will not be able to get this "for free" using some utility (unless there happens to be some third party library that has solved this specific problem, and I doubt it).
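The reflection route can be smaller than it sounds if the local fields are marked with an annotation. Below is a minimal sketch under that assumption: @LocalOnly, LocalFieldCopier and the Dispatch model are all hypothetical names, and both objects are assumed to be of the same class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

/** Hypothetical marker for fields that exist only in the local app. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface LocalOnly {}

final class LocalFieldCopier {
    /** Copies every @LocalOnly field from one instance to another of the same class. */
    static <T> void copyLocalFields(T from, T to) {
        // Walk up the hierarchy so inherited local fields are copied too.
        for (Class<?> c = from.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (!field.isAnnotationPresent(LocalOnly.class)) continue;
                try {
                    field.setAccessible(true);
                    field.set(to, field.get(from));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot copy " + field.getName(), e);
                }
            }
        }
    }
}

/** Hypothetical model: one synced field, one local-only field. */
class Dispatch {
    String eventUuid;
    @LocalOnly String locallyCreatedValue;
}
```

In onChildChanged this would replace the per-field setter calls with a single LocalFieldCopier.copyLocalFields(old, newModel).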
Consider instead maybe storing your locally created values as a Map, then simply copying that map wholesale between objects. But then you have unknown types for the values if they're all different.
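A minimal sketch of the Map idea, with hypothetical names (DispatchItem, putLocal, getLocal, adoptLocalValues). Passing the expected Class on read is one way to soften the unknown-types problem. Keeping the map private and without a public getter is an assumption about how to keep it out of Firebase's serialization; check that against your SDK version.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model: synced fields plus one bag of local-only values. */
class DispatchItem {
    private String eventUuid; // synced field

    // Local-only state; deliberately has no public getter or setter.
    private final Map<String, Object> localValues = new HashMap<>();

    public String getEventUuid() { return eventUuid; }
    public void setEventUuid(String eventUuid) { this.eventUuid = eventUuid; }

    void putLocal(String name, Object value) {
        localValues.put(name, value);
    }

    /** Typed read; throws ClassCastException if the stored value has another type. */
    <T> T getLocal(String name, Class<T> type) {
        return type.cast(localValues.get(name));
    }

    /** Copies all local values from the instance this one replaces. */
    void adoptLocalValues(DispatchItem old) {
        localValues.putAll(old.localValues);
    }
}
```

With this, the update handler needs one call, newModel.adoptLocalValues(old), no matter how many local values are added later.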
Or rewrite your code in JavaScript, which has easy property enumeration. :-)

DTOs with different granularity

I'm on a project that uses the latest Spring+Hibernate for persistence and for implementing a REST API.
The different tables in the database contain lots of records which are in turn pretty big as well. So, I've created a lot of DAOs to retrieve different levels of detail and their accompanying DTOs.
For example, say I have an Employee table in the database that contains tons of information about each employee, and I know that any client using my application would benefit greatly from retrieving different levels of detail of an Employee entity (instead of being bombarded by the entire entity every time). What I've been doing so far is something like this:
class EmployeeL1DetailsDto
{
    String id;
    String firstName;
    String lastName;
}
class EmployeeL2DetailsDto extends EmployeeL1DetailsDto
{
    Position position;
    Department department;
    PhoneNumber workPhoneNumber;
    Address workAddress;
}
class EmployeeL3DetailsDto extends EmployeeL2DetailsDto
{
    int yearsOfService;
    PhoneNumber homePhoneNumber;
    Address homeAddress;
    BigDecimal salary;
}
And so on...
Here you see that I've divided the Employee information into different levels of detail.
The accompanying DAO would look something like this:
class EmployeeDao
{
    ...
    public List<EmployeeL1DetailsDto> getEmployeeL1Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1 columns
        return list;
    }
    public List<EmployeeL2DetailsDto> getEmployeeL2Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1+L2 columns
        return list;
    }
    public List<EmployeeL3DetailsDto> getEmployeeL3Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1+L2+L3 columns
        return list;
    }
    // And so on
}
I've been using Hibernate's aliasToBean() to auto-map the retrieved entities into the DTOs. Still, I feel the amount of boilerplate in the process as a whole (all the DTOs, DAO methods, URL parameters for the level of detail wanted, etc.) is a bit worrying and makes me think there might be a cleaner approach to this.
So, my question is: Is there a better pattern to follow to retrieve different levels of detail from a persisted entity?
I'm pretty new to Spring and Hibernate, so feel free to point anything that is considered basic knowledge that you think I'm not aware of.
Thanks!

I would go with as few different queries as possible. I would rather make associations lazy in my mappings, and then let them be initialized on demand with appropriate Hibernate fetch strategies.
I think that there is nothing wrong in having multiple different DTO classes per one business model entity, and that they often make the code more readable and maintainable.
However, if the number of DTO classes tends to explode, then I would make a balance between readability (maintainability) and performance.
For example, if a DTO field is not used in a context, I would leave it as null, or fill it in anyway if that is really not expensive. Then, if it is null, you could instruct your object marshaller to exclude null fields when producing the REST service response (JSON, XML, etc.) if it really bothers the service consumer. Or, if you are filling it in, it is always welcome later when you add new features to the application and it starts being used in a context.

You will have to define in one way or another the different granularity versions. You can try to have subobjects that are not loaded/set to null (as recommended in other answers), but it can easily get quite awkward, since you will start to structure your data by security concerns and not by domain model.
So doing it with individual classes is after all not such a bad approach.
You might want to have it more dynamic (maybe because you want to extend even your data model on db side with more data).
If that's the case you might want to move the definition out of the code and into some configuration (which could even be dynamic at runtime). This will of course require a dynamic data model on the Java side as well, like using a hashmap (see here on how to do that). You thereby gain a dynamic data model, but lose type safety (at least to a certain extent). In other languages that would probably feel natural, but in Java it's less common.
It would now be up to your HQL to define how you want to populate your object.
The path you want to take now depends a lot on the context and on how your object will get used.
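As a rough illustration of the configuration-driven variant, here is a minimal sketch in which each detail level is just a named list of fields and a record is a plain map. The DetailLevels class, the level names and the field names are hypothetical, and the projection is done in memory here, whereas in practice you would probably want the query itself to select only the configured columns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry of detail levels; could be filled from a config file. */
final class DetailLevels {
    private final Map<String, List<String>> fieldsByLevel = new LinkedHashMap<>();

    void define(String level, String... fields) {
        fieldsByLevel.put(level, Arrays.asList(fields));
    }

    /** Returns a copy of the record that holds only the fields of the given level. */
    Map<String, Object> project(Map<String, Object> record, String level) {
        List<String> fields = fieldsByLevel.get(level);
        if (fields == null) {
            throw new IllegalArgumentException("Unknown detail level: " + level);
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : fields) {
            if (record.containsKey(field)) {
                out.put(field, record.get(field));
            }
        }
        return out;
    }
}
```

Adding a new level then means one more define("L4", ...) call or config entry instead of a new DTO class and DAO method, at the price of the type safety mentioned above.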

Another approach is to use only domain objects at the DAO level, and define the needed subsets of information as a DTO for each usage. Then convert the Employee entity to each DTO using the Generic DTO converter, as I have done lately in my professional Spring activities. An MIT-licensed module is available as the Maven repository artifact dtoconverter,
and further info and user guidance are on the author's wiki:
http://ratamaa.fi/trac/dtoconverter
You get the quickest idea from the example page there.
Happy hunting...

Blaze-Persistence Entity Views have been created for exactly such a use case. You define the DTO structure as interface or abstract class and have mappings to your entity's attributes. When querying, you just pass in the class and the library will take care of generating an optimized query for the projection.
Here is a quick example:
@EntityView(Cat.class)
public interface CatView {
    @IdMapping("id")
    Integer getId();
    String getName();
}
CatView is the DTO definition, and here comes the querying part:
CriteriaBuilder<Cat> cb = criteriaBuilderFactory.create(entityManager, Cat.class);
cb.from(Cat.class, "theCat")
    .where("father").isNotNull()
    .where("mother").isNotNull();
EntityViewSetting<CatView, CriteriaBuilder<CatView>> setting = EntityViewSetting.create(CatView.class);
List<CatView> list = entityViewManager
    .applySetting(setting, cb)
    .getResultList();
Note that the essential part is that the EntityViewSetting has the CatView type, which is applied onto an existing query. The generated JPQL/HQL is optimized for the CatView, i.e. it only selects (and joins!) what it really needs.
SELECT
theCat.id,
theCat.name
FROM
Cat theCat
WHERE theCat.father IS NOT NULL
AND theCat.mother IS NOT NULL

Volatile data in Solr

I have an index of documents which is distributed over several shards and replicas. It holds about 40 million documents and I expect it to grow.
Problem: Users add information to these documents, which they change quite frequently. They need it to be integrated into the search syntax, e.g. funny and cool and cat:interesting, where cat would be a volatile data set.
As far as I know, neither Solr nor Lucene supports "true update", which means that I have to reindex the whole set of changed documents. Thus I need to connect it to an external data source such as a relational database.
I did it in Lucene with extendable search (http://lucene.apache.org/core/4_3_0/queryparser/index.html). The algorithm was pretty easy:
Preprocess the query by adding "_" to all external fields
Map these fields to classes
Each class extends the org.apache.lucene.search.Filter class and converts IDs to a bitset by overriding public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException:
ResultSet set = state.executeQuery();
OpenBitSet bitset = new OpenBitSet();
while (set.next()) {
    bitset.set(set.getInt("ID"));
}
Then by extending org.apache.lucene.queryparser.ext.ParserExtension, I override parse like this:
public Query parse(ExtensionQuery eq) throws ParseException {
    String cat = eq.getRawQueryString();
    Filter filter = _cache.getFilter(cat);
    return new ConstantScoreQuery(filter);
}
Extend org.apache.lucene.queryparser.ext.Extensions using the add method, and done.
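The first and third steps above can be sketched without Lucene, using only the JDK. This is an illustration of the idea rather than the original code: ExternalFieldSupport and its methods are hypothetical, java.util.BitSet stands in for OpenBitSet, and the external IDs are assumed to coincide with document numbers, as the snippet above appears to assume.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.regex.Pattern;

/** Hypothetical helpers for query preprocessing and ID-based filtering. */
final class ExternalFieldSupport {
    /** Prefixes every external field name in the query with "_". */
    static String markExternalFields(String query, Collection<String> externalFields) {
        StringBuilder alternatives = new StringBuilder();
        for (String field : externalFields) {
            if (alternatives.length() > 0) alternatives.append('|');
            alternatives.append(Pattern.quote(field));
        }
        if (alternatives.length() == 0) return query;
        // The lookbehind ensures only whole field names match, so "bobcat:" is left alone.
        Pattern fieldPattern = Pattern.compile("(?<!\\w)(" + alternatives + "):");
        return fieldPattern.matcher(query).replaceAll("_$1:");
    }

    /** Converts the IDs returned by the external source into a bitset. */
    static BitSet toBitSet(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) bits.set(id);
        return bits;
    }

    /** Keeps only the matches whose ID is also set in the external bitset. */
    static BitSet restrict(BitSet matches, BitSet allowed) {
        BitSet result = (BitSet) matches.clone();
        result.and(allowed);
        return result;
    }
}
```

For example, markExternalFields("funny and cool and cat:interesting", Arrays.asList("cat")) yields "funny and cool and _cat:interesting", and restrict() plays the role that the Filter plays inside the ConstantScoreQuery.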
But HOW to do this in Solr?
I found a couple of suggestions:
Using external field (http://lucene.apache.org/solr/4_3_0/solr-core/org/apache/solr/schema/ExternalFileField.html)
NRT (http://wiki.apache.org/solr/NearRealtimeSearch), which looks a little bit under construction to me.
Any ideas how to do it in Solr? Maybe there are some code examples?
Please also consider that I'm kind of new to Solr.
Thank you

The Solr 4.x releases all support Atomic Update which I believe may satisfy your needs.

Doctrine-like array access for Java

I plan to move from PHP to Java for writing data-driven web apps. I obviously want to have a layer handling persistent data. In PHP with Doctrine (1.x) the following things can be done thru a single interface (PHP's ArrayAccess):
Representing data structures in code
Getting structured data from the database thru Doctrine
Representing structured data in an HTML form
So it is essential that I can have a layer for forms like:
$properties = array (
"minlength" => 2,
"maxlength" => 30,
);
new TextInput ("name", $properties);
... which is oblivious to the underlying mechanics. It can load and save (possibly structured) data from all the sources above thru a single interface.
When saving data to a record it cannot call setName($value). It can only call set("name", $value). (Of course it could be done thru reflection, but I hope I don't have to elaborate on why that's a bad idea.)
So is there any ORM in Java which:
Implements the native collection interfaces. java.util.Map for example.
Maps DB relations as collections like author.get("books").put(newBook)
Has the right triggers to implement complex logic (like permissions or external files attached to fields).

Map access for POJO classes can be achieved thru a superclass implementing Map via Hibernate's ClassMetadata interface, like:
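abstract class MappedRecord implements java.util.Map<String, Object> {
    private ClassMetadata classMeta;
    public MappedRecord() {
        classMeta = mySessionFactory.getClassMetadata(this.getClass());
    }
    public Object put(String s, Object o) {
        // Map.put must return the previous value
        Object previous = classMeta.getPropertyValue(this, s, EntityMode.POJO);
        classMeta.setPropertyValue(this, s, o, EntityMode.POJO);
        return previous;
    }
    // remaining Map methods omitted
}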
Then when you extend MappedRecord in your persistent classes, you can call:
User u = new User();
u.put("name", "John");
Safely getting mySessionFactory is a tricky question though.
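For comparison, the same set("name", value) style can be had without a session factory by going thru plain JDK reflection, which the question would rather avoid, so treat this only as a minimal sketch of the trade-off. PropertyRecord and UserRecord are hypothetical names and the lookup relies on getX/setX naming.

```java
import java.lang.reflect.Method;

/** Hypothetical base class giving name-based access to getters and setters. */
abstract class PropertyRecord {
    public Object get(String property) {
        return invoke(find("get", property, 0));
    }

    public void set(String property, Object value) {
        invoke(find("set", property, 1), value);
    }

    /** Finds e.g. getName or setName for the property "name". */
    private Method find(String prefix, String property, int parameterCount) {
        String methodName = prefix
                + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method m : getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == parameterCount) {
                return m;
            }
        }
        throw new IllegalArgumentException("No such property: " + property);
    }

    private Object invoke(Method method, Object... args) {
        try {
            return method.invoke(this, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot access " + method.getName(), e);
        }
    }
}

/** Hypothetical persistent class. */
class UserRecord extends PropertyRecord {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

Usage mirrors the example above: after u.set("name", "John"), both u.get("name") and u.getName() return "John".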

You may want to have a look into Hibernate and JPA

I think NHibernate is the choice, but I'm not sure I understood your requirement about triggers. I think that belongs more to the application layer than to the ORM layer.
