Android Firestore limitations to custom object models - java

I am migrating my app to use Firebase Firestore, and one of my models is very complex (it contains lists of other custom objects). Looking at the documentation on how to commit a model object as a document, it looks like you simply create your model class with a public no-argument constructor, plus getters and setters.
For example, from the add data guide:
public class City {
    private String name;
    private String state;
    private String country;
    private boolean capital;
    private long population;
    private List<String> regions;

    public City() {} // required public no-argument constructor

    public City(String name, String state, String country,
                boolean capital, long population, List<String> regions) {
        this.name = name;
        this.state = state;
        this.country = country;
        this.capital = capital;
        this.population = population;
        this.regions = regions;
    }

    // getters/setters
}
Firestore automatically translates this to and from a document without any additional steps. You pass an instance to a DocumentReference.set(city) call, and retrieve it from a call to DocumentSnapshot.toObject(City.class).
How exactly does it serialize this to a document? Through reflection? It doesn't discuss any limitations. Basically, I'm left wondering if this will work on more complex models, and how complex. Will it work for a class with an ArrayList of custom objects?

Firestore automatically translates this to and from a document without any additional steps. How exactly does it serialize this to a document? Through reflection?
You're guessing right: through reflection. As @Doug Stevenson also mentioned in his comment, that's very common for systems like Firebase, to convert JSON data to and from POJOs (Plain Old Java Objects). Please also note that the setters are not required. If there is no setter for a JSON property, the Firebase client will set the value directly onto the field. A constructor with arguments is also not required. While both are idiomatic, there are good cases for having classes without them. Please also take a look at some information regarding the existence of the no-argument constructor.
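To illustrate (a minimal sketch of my own, not taken from the guide), a class like the following can still be populated by toObject(), because the client creates it through the no-argument constructor and writes values directly onto the fields:

public class Landmark {
    private String name;   // hypothetical fields, for illustration only
    private double rating;

    public Landmark() {}   // no-argument constructor used by the client

    // getters only; no setters and no all-arguments constructor needed
    public String getName() { return name; }
    public double getRating() { return rating; }
}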
It doesn't discuss any limitations.
Yes, it does. The official documentation explains that documents have limits, so there are limits on how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB of total data in a single document. When it comes to storing text, that is quite a lot, but as your arrays of custom objects grow, be careful about this limitation.
Please also note that if you are storing large amounts of data in arrays, and those arrays need to be updated by lots of users, there is another limitation to take care of: you are limited to 1 write per second on any single document. So if you have a situation in which a lot of users are all trying to write/update data to the same document at once, you might start to see some of these writes fail. So, be careful about this limitation too.
Will it work for a class with an ArrayList of custom objects?
It will work with any kind of class, as long as its fields are supported data types.
Basically, I'm left wondering if this will work on more complex models, and how complex.
It will work with any kind of complex model, as long as you are using the correct data types for your objects and your documents stay within that 1 MiB limitation.
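As a hedged illustration (the Country class and the document names below are my own, not from the docs), a model nesting a list of custom objects round-trips the same way, as long as every field is a supported type:

public class Country {
    private String name;
    private List<City> cities; // a list of the City POJO shown above

    public Country() {}

    public String getName() { return name; }
    public List<City> getCities() { return cities; }
}

// Writing and reading it back:
void saveAndLoad(FirebaseFirestore db, Country country) {
    db.collection("countries").document("usa").set(country);

    db.collection("countries").document("usa").get()
            .addOnSuccessListener(snapshot -> {
                Country loaded = snapshot.toObject(Country.class);
            });
}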

Related

Indexing a simple Java Record

I have a Java object, Record. It represents a single record as a result of SQL execution. Can CQEngine index a collection of Record objects?
My class is of the form
public class Record {
    private List<String> columnNames;
    private List<Object> values;
    // ... other getters
}
I have looked through some examples, but I have no luck there.
I want to index only specific column(s) with their names and corresponding values. Can this be achieved using CQEngine, or is there another alternative to achieve the same?
Thanks.
That seems to be a strange way to model data, but you can use CQEngine with that model if you wish.
(First off, CQEngine will have no use for your column names so you can remove that field.)
To do this, you will need to define a CQEngine virtual attribute for each of the indexes in your list of values.
Each attribute will need to be declared with the data type which will be stored in that column/index, and will need to be able to cast the object at that index in your list of values, to the appropriate data type (String, Double, Integer etc.).
So let's say your Record has a column called 'price', which is of type Double, and is stored at index 5 in the list of values. You could define an attribute which reads it as follows:
public static final Attribute<Record, Double> PRICE =
    attribute("PRICE", record -> (Double) record.values.get(5));
If this sounds complicated, it's because that way of modelling data makes things a bit complicated :) It's usually easier to work with a data model which leverages the Java type system (which your model does not). As such, you will need to keep track of the data types etc. of each field programmatically yourself.
CQEngine itself will work fine with that model though, because at the end of the day CQEngine attributes don't need to read fields, the attributes are just functions which are programmed to fetch values.
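To make that concrete, here is a hedged usage sketch building on the PRICE attribute above (the collection contents and the query value are assumed):

import com.googlecode.cqengine.ConcurrentIndexedCollection;
import com.googlecode.cqengine.IndexedCollection;
import com.googlecode.cqengine.index.navigable.NavigableIndex;
import com.googlecode.cqengine.resultset.ResultSet;
import static com.googlecode.cqengine.query.QueryFactory.greaterThan;

IndexedCollection<Record> records = new ConcurrentIndexedCollection<>();
records.addIndex(NavigableIndex.onAttribute(PRICE)); // index on values index 5

// find all records whose price exceeds 100.0
ResultSet<Record> results = records.retrieve(greaterThan(PRICE, 100.0));
for (Record r : results) {
    // process each matching record...
}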
There's a bunch of stuff not covered above. For example, can your values be null? (If so, you should use the nullable variety of attributes as discussed in the CQEngine docs.) Or might each of your Record objects have different sets of columns? (If so, you can create attributes on-the-fly when you encounter a new column, but you should probably cache the attributes you have created somewhere.)
Hope that helps,
Niall (CQEngine author)

Firebase Java POJOs and local-only fields

Given the following POJO example, which stores local fields applicable only to the app running right here, not to anybody else who also uses the Firebase data:
@IgnoreExtraProperties
public class DispatchModel {
    private String isotimestamp;
    private String date_time;
    private String event_uuid;
    //...
    private String locallyCreatedValue;
    //...constructor, getters, setters
}
Given that my Firebase data has the first three fields stored in it, and the locallyCreatedValue is added to my POJO during app runtime, is there a way to automatically fetch all the locally added content from a POJO instance and apply it to the new instance when an update from an onChildChanged event happens?
As it is right now, I'll have to manually obtain all the local field values and set them on the new instance:
@Override
public void onChildChanged(DataSnapshot dataSnapshot, String s) {
    DispatchModel newModel = dataSnapshot.getValue(DispatchModel.class);
    // get list index of expiring instance
    // get instance of old item at list index
    // index = ...
    // old = ...
    // repeat this for every local item :-/
    newModel.setLocallyCreatedValue(old.getLocallyCreatedValue());
    dispatchList.set(index, newModel);
}
I plan on having quite a few local fields; is this my only option? Are there any functions Firebase offers that make its automatic object instantiation more friendly to my extras? I'm not keen on creating distinct POJOs to track the Firebase POJOs in parallel. That leads to data errors from decoupled data updates and careful schedules for execution.
If all of your locally created values are expressed as standard getters and setters in your POJO, you are either going to have to copy them manually, or write some fairly intense Java reflection code to inspect the class, somehow figure out which properties are local (annotation? inclusion/exclusion from a known list?) and should be copied over, then actually do that work. You will not be able to get this "for free" using some utility (unless there happens to be some third party library that has solved this specific problem, and I doubt it).
Consider instead maybe storing your locally created values as a Map, then simply copying that map wholesale between objects. But then you have unknown types for the values if they're all different.
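A hedged sketch of that Map idea (the field and method names are mine): annotate the map's getter with @Exclude so Firebase never serializes it, then copy the whole map in a single call:

import com.google.firebase.database.Exclude;
import com.google.firebase.database.IgnoreExtraProperties;
import java.util.HashMap;
import java.util.Map;

@IgnoreExtraProperties
public class DispatchModel {
    private String isotimestamp;
    // ...the synced fields as before...

    private Map<String, Object> localValues = new HashMap<>();

    @Exclude // keep the whole map out of Firebase serialization
    public Map<String, Object> getLocalValues() { return localValues; }

    public void setLocalValues(Map<String, Object> localValues) {
        this.localValues = localValues;
    }
}

// In onChildChanged, one line replaces the per-field copying:
// newModel.setLocalValues(old.getLocalValues());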
Or rewrite your code in JavaScript, which has easy property enumeration. :-)

DTOs with different granularity

I'm on a project that uses the latest Spring+Hibernate for persistence and for implementing a REST API.
The different tables in the database contain lots of records which are in turn pretty big as well. So, I've created a lot of DAOs to retrieve different levels of detail and their accompanying DTOs.
For example, say I have an Employee table in the database that contains tons of information about each employee, and I know that any client using my application would benefit greatly from retrieving different levels of detail of an Employee entity (instead of being bombarded by the entire entity every time). What I've been doing so far is something like this:
class EmployeeL1DetailsDto
{
    String id;
    String firstName;
    String lastName;
}

class EmployeeL2DetailsDto extends EmployeeL1DetailsDto
{
    Position position;
    Department department;
    PhoneNumber workPhoneNumber;
    Address workAddress;
}

class EmployeeL3DetailsDto extends EmployeeL2DetailsDto
{
    int yearsOfService;
    PhoneNumber homePhoneNumber;
    Address homeAddress;
    BigDecimal salary;
}
And so on...
Here you see that I've divided the Employee information into different levels of detail.
The accompanying DAO would look something like this:
class EmployeeDao
{
    ...

    public List<EmployeeL1DetailsDto> getEmployeeL1Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1 columns
        return list;
    }

    public List<EmployeeL2DetailsDto> getEmployeeL2Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1+L2 columns
        return list;
    }

    public List<EmployeeL3DetailsDto> getEmployeeL3Detail()
    {
        ...
        // uses a criteria-select query to retrieve only L1+L2+L3 columns
        return list;
    }

    // and so on
}
I've been using Hibernate's aliasToBean() to auto-map the retrieved entities onto the DTOs. Still, the amount of boilerplate in the process as a whole (all the DTOs, DAO methods, URL parameters for the level of detail wanted, etc.) is a bit worrying and makes me think there might be a cleaner approach to this.
So, my question is: Is there a better pattern to follow to retrieve different levels of detail from a persisted entity?
I'm pretty new to Spring and Hibernate, so feel free to point anything that is considered basic knowledge that you think I'm not aware of.
Thanks!
I would go with as few different queries as possible. I would rather make the associations lazy in my mappings, and then let them be initialized on demand with appropriate Hibernate fetch strategies.
I think that there is nothing wrong in having multiple different DTO classes per one business model entity, and that they often make the code more readable and maintainable.
However, if the number of DTO classes tends to explode, then I would make a balance between readability (maintainability) and performance.
For example, if a DTO field is not used in a context, I would leave it as null, or fill it in anyway if that is really not expensive. Then, if it is null, you could instruct your object marshaller to exclude null fields when producing the REST service response (JSON, XML, etc.) if that really bothers the service consumer. Or, if you are filling it in, it will be welcome later when you add new features to the application and the field starts being used in a context.
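For instance, if Jackson happens to be the marshaller (a common default in Spring REST stacks; the DTO shape below is assumed), excluding null fields is a one-line annotation:

import com.fasterxml.jackson.annotation.JsonInclude;

@JsonInclude(JsonInclude.Include.NON_NULL) // null fields are omitted from the JSON
class EmployeeDto {
    String id;
    String firstName;
    Address homeAddress; // left null in low-detail responses, so not serialized
}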
You will have to define the different granularity versions in one way or another. You can try to have sub-objects that are not loaded/set to null (as recommended in other answers), but that can easily get quite awkward, since you will start to structure your data by security concerns and not by the domain model.
So doing it with individual classes is after all not such a bad approach.
You might want to make it more dynamic (maybe because you want to extend your data model on the DB side with more data).
If that's the case, you might want to move the definition out of the code into some configuration (which could even be dynamic at runtime). This will of course require a dynamic data model on the Java side as well, like using a hashmap (see here on how to do that, and the sketch below). You thereby gain a dynamic data model, but lose the type safety (at least to a certain extent). In other languages that would probably feel natural, but in Java it's less common.
It would now be up to your HQL to define how you want to populate your objects.
The path you want to take now depends a lot on the context and how your objects will be used.
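As a rough sketch of that hashmap-based model (entirely my own, under the assumption that property names come from configuration):

import java.util.HashMap;
import java.util.Map;

class DynamicDto {
    private final Map<String, Object> properties = new HashMap<>();

    public void set(String name, Object value) {
        properties.put(name, value);
    }

    @SuppressWarnings("unchecked")
    public <T> T get(String name) {
        return (T) properties.get(name); // type safety is the caller's responsibility
    }
}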
Another approach is to use only domain objects at the DAO level, and define the needed subsets of information as DTOs for each usage. Then convert the Employee entity to each of the DTOs using a generic DTO converter, as I have used lately in my professional Spring activities. The MIT-licensed module is available in the Maven repository as the artifact dtoconverter,
with further info and user guidance at the author's wiki:
http://ratamaa.fi/trac/dtoconverter
The quickest idea you can get is from the example page there.
Happy hunting...
Blaze-Persistence Entity Views have been created for exactly such a use case. You define the DTO structure as an interface or abstract class with mappings to your entity's attributes. When querying, you just pass in the class, and the library takes care of generating an optimized query for the projection.
Here is a quick example:
@EntityView(Cat.class)
public interface CatView {
    @IdMapping("id")
    Integer getId();
    String getName();
}
CatView is the DTO definition, and here comes the querying part:
CriteriaBuilder<Cat> cb = criteriaBuilderFactory.create(entityManager, Cat.class);
cb.from(Cat.class, "theCat")
  .where("father").isNotNull()
  .where("mother").isNotNull();

EntityViewSetting<CatView, CriteriaBuilder<CatView>> setting =
    EntityViewSetting.create(CatView.class);

List<CatView> list = entityViewManager
    .applySetting(setting, cb)
    .getResultList();
Note that the essential part is that the EntityViewSetting carries the CatView type and is applied onto an existing query. The generated JPQL/HQL is optimized for CatView, i.e. it only selects (and joins!) what it really needs:
SELECT
    theCat.id,
    theCat.name
FROM
    Cat theCat
WHERE
    theCat.father IS NOT NULL
AND theCat.mother IS NOT NULL

Key-Value on top of Appengine

Although App Engine is already schema-less, you still need to define the entities that will be stored in the Datastore through the DataNucleus persistence layer. So I am thinking of a way to get around this: a layer that stores key-value pairs at runtime, instead of compile-time entities.
The way this is done with Redis is by creating a key like this:
private static final String USER_ID_FORMAT = "user:id:%s";
private static final String USER_NAME_FORMAT = "user:name:%s";
From the docs, Redis types are: String, Linked list, Set, Sorted set. I am not sure if there are more.
As far as the GAE Datastore is concerned, a String "Key" and a "Value" have to be the entity that will be stored.
Like:
public class KeyValue {
    private String key;
    private Value value; // value can be a String, Linked list, Set, Sorted set, etc.
    // Code omitted
}
The justification for this scheme is rooted in the RESTful access to the Datastore (provided by datanucleus-api-rest).
Using this REST API, to persist an object or entity:
POST http://datanucleus.appspot.com/dn/guestbook.Greeting
{
    "author": null,
    "class": "guestbook.Greeting",
    "content": "test insert",
    "date": 1239213923232
}
The problem with this approach is that in order to persist an entity, the actual class needs to be defined at compile time; with the key-value store mechanism, by contrast, we can simplify the call:
POST http://datanucleus.appspot.com/dn/org.myframework.KeyValue
{
    "class": "org.myframework.KeyValue",
    "key": "user:id:johnsmith;followers",
    "value": "the_list"
}
Passing a single string as the "value" is fairly easy; I can use a JSON array for a list, set, or sorted list. The real question is how to actually persist the different types of data passed into the interface. Should there be multiple KeyValue entities, each representing a basic type it supports: KeyValueString? KeyValueList? etc.
Looks like you're using a JSON-based REST API, so why not just store Value as a JSON string?
You do not need to use the Datanucleus layer, or any of the other fine ORM layers (like Twig or Objectify). Those are optional, and are all based on the low-level API. If I interpret what you are saying properly, perhaps it already has the functionality that you want. See: https://developers.google.com/appengine/docs/java/datastore/entities
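To make that concrete, here is a hedged sketch using the low-level API (the kind name and key values are made up), storing the value as a JSON string as suggested above:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

// write: the key name carries the "user:id:..." convention from the question
Entity kv = new Entity("KeyValue", "user:id:johnsmith;followers");
kv.setUnindexedProperty("value", "[\"alice\",\"bob\"]"); // JSON payload
datastore.put(kv);

// read it back by key
Key key = KeyFactory.createKey("KeyValue", "user:id:johnsmith;followers");
Entity fetched = datastore.get(key); // throws EntityNotFoundException if absent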
DataNucleus is a specific framework that runs on top of GAE. You can, however, access the database at a lower, less structured, more key/value-like level: the low-level API. That's the lowest level you can access directly.
BTW, the low-level "GAE datastore" internally runs on six global Google Megastore tables, which in turn are hosted on the Google Bigtable database system.
Saving JSON as a String works fine. But you will need ways to retrieve your objects other than by ID. That is, you need a way to index your data to support any kind of useful query on it.

solrj: how to store and retrieve List<POJO> via multivalued field in index

My use case is an index which holds titles of online media. The provider of the data associates a list of categories with each title. I am using SolrJ to populate the index via an annotated POJO class
e.g.
#Field("title")
private String title;
#Field("categories")
private List<Category> categoryList;
The associated POJO is:
public class Category {
    private Long id;
    private String name;
    ...
}
My question has two parts:
a) Is this possible via SolrJ? The docs only contain an example of @Field using a List of String, so I assume the serialization/marshalling only supports simple types?
b) How would I set up the schema to hold this? I have a naive assumption that I just need to set multiValued=true on the required field and it will all work by magic.
I'm just starting to implement this, so any response would be highly appreciated.
The answer is as you thought:
a) You have only simple types available. So you will have a List of the same simple type, e.g. String. The point is that you can't represent complex types inside a Lucene document, so you won't be able to deserialize them either.
b) The problem is that you are trying to represent relational thinking in a "document store". That will probably only work up to a point. If you want to represent categories inside a Lucene document, just use the name string; it is not necessary to store an id as well.
The only reason to store an id as well is if you want to do a lookup in an RDBMS alongside the search. If you want to do this, you need to make sure that the id and the category name are soft-linked. This does not work for every 1:n relation. (Every 1:n relation where the related table consists only of required fields is possible. If you have an optional field, you need to put something like a filler empty-constant in the field, if possible.)
However, if these 1:n relations are not sparse, it is actually possible, provided you maintain the order in which you add fields to the document. So the category relation can probably be represented if you don't sort the lists.
You may implement a method which returns a Category by instantiating it with the values at position 0...n. So the solution would be: if you want the first category, it will be at position 0 of every list related to this category.
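A hedged sketch of that parallel-lists approach (the field names are invented; both Solr fields would be declared multiValued=true in the schema and must be kept in the same order, and the Category constructor is assumed):

import org.apache.solr.client.solrj.beans.Field;
import java.util.List;

public class MediaTitle {
    @Field("title")
    private String title;

    @Field("category_ids")
    private List<Long> categoryIds;

    @Field("category_names")
    private List<String> categoryNames;

    // rebuild the i-th Category from the two parallel lists
    public Category categoryAt(int i) {
        return new Category(categoryIds.get(i), categoryNames.get(i));
    }
}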
